0

.txt result wanted, from xml list columns

Good afternoon everyone,

I am having some difficulties in continuing with the code and I need your help.

Basically I have a file in .xml that it is opening as a table and I try to search only 3 columns that start with these fields:

NAME17 | POINTS_COUNT | Pt

In the "NAME17" fields, there are names that are repeated in many lines.

The goal is to always start each field listing of "NAME17" with

"START" and field name

"P" and number of existing coordinates for the field

after the last coordinate (field line) places

"END" (that is, between P 343 and "END" we will have 343 lines with such x, y and z coordinates)

and then, after placing the END P 343, put a new cycle, with the name of the 2 field of "NAME17" and put "START" and name of the next field again ... and repeat ...

"P" and number of coordinates that exist for the "RAIADOR" field and then all coordinates ...

and again "END RADIATOR" and new cycle with "P000", etc.

START EOP
P 343
2181.1800 673.4920 -864.6050
END EOP
START RAIADOR
P 354
2212.6300 660.3580 -886.8900
...
END RAIADOR
START...

Below the code already developed, in which I am having trouble continuing to create the reading cycles (for), inserting lines with the Starts / END and the P and No. and coordinates ... the final result should be recorded as a .txt separated by tabs for the case of the coordinates (x, y and z).

The .xml file and the final result in .txt(tabs) wanted is here: https://filesend.standardnotes.org/send/ROsr3jwXX5aXULYnNSrx#MjUzYjhjOGFkNjUyNjY5MTE1MmZm

Any Help?

` Sub MACRO_TXT_TAB() Application.DisplayAlerts = False

ActiveWindow.SmallScroll ToRight:=48

Range("Tabela1[[#Headers],[NAME17]]").EntireColumn.Select
Selection.Copy
Sheets.Add After:=ActiveSheet
ActiveSheet.Paste
Sheets("Folha1").Select
Application.CutCopyMode = False


Range("Tabela1[[#Headers],[POINTS_COUNT]]").EntireColumn.Select
Selection.Copy
Sheets("Folha2").Select
Range("B1").Select
ActiveSheet.Paste
Sheets("Folha1").Select
Application.CutCopyMode = False

Range("Tabela1[[#Headers],[Pt]]").EntireColumn.Select
Selection.Copy
Sheets("Folha2").Select
Range("C1").Select
ActiveSheet.Paste
Range("A1:C1").Select
Application.CutCopyMode = False
Columns("A:A").Select
Selection.SpecialCells(xlCellTypeBlanks).Select
Selection.EntireRow.Delete
Range("a1").Select

Columns("A:A").Select
Selection.Copy
Range("F1").Select
ActiveSheet.Paste
Application.CutCopyMode = False
ActiveSheet.Range("$F$1:$F$50000").RemoveDuplicates Columns:=1, Header:= _
    xlYes
Sheets.Add After:=ActiveSheet
ActiveCell.FormulaR1C1 = "=""START ""&Folha2!R2C6"
Range("A2").Select
ActiveCell.FormulaR1C1 = "=""P""&Folha2!RC[1]+""1"""
Range("A3").Select

Sheets("Folha2").Select
Sheets("Folha2").Activate


Dim i, NLinhas2, Nlinhas3 As Integer
Dim nome As String


i = 0
NLinhas2 = Range("A65536").End(xlUp).Row
Nlinhas3 = 3

For i = 1 To NLinhas2

nome = Range("A2").Value 'coloca primeiro nome
If (nome) = Range("A" & i + 1) Then
Worksheets("Folha3").Cells(Nlinhas3, 1) = Worksheets("Folha2").Cells(i + 1, 3)
Else: Exit For
End If

Nlinhas3 = Nlinhas3 + 1

Next i


Application.DisplayAlerts = True
End Sub

`

this is the values, i'm run the original code in sheet 1 of original .xml file

could be the error this variable m?

It's continues to stop with a error in this line
{ Err.Raise vbObjectError + 1, "LoadData", "Invalid data format in cell " & _ .Cells(dataRange.Rows(1).row + row - 1, dataRange.Columns(1).Column + lastCol - 1).Address }

and no variable m and definition next

This is my file with diferent columns for data, with start in row 10

This is the last result with the modification code

It continuous to do the error in xval = m(0).SubMatches(0) - Type mismatch

'Option Explicit

' Function reads the given range and returns a dictionary.
' key is the first column of the selected range,
' value is a list of arrays (x, y, z).
Function LoadData(ByVal dataRange As Range) As Dictionary

Dim dict As Dictionary
Dim re As VBScript_RegExp_55.RegExp
Dim m As VBScript_RegExp_55.MatchCollection
Dim rawData As Variant
Dim lastCol As Long
Dim row As Long
Dim name17 As String
Dim coordinateCell As Range
Dim coordinates As String
Dim xval As Double
Dim yval As Double
Dim zval As Double

' Initialize dictionary.
Set dict = New Dictionary

' Initialize regular expression.
Set re = New RegExp
re.Pattern = "^\s*X=(-?\d+\.\d+)\s+Y=(-?\d+\.\d+)\s+Z=(-?\d+\.\d+)\s*$"

' Get the coordinates from the last column of the data range.
lastCol = dataRange.Columns.Count

' Copy selected range to array for faster processing.
rawData = dataRange.Value
   
' For each row in data array...
For row = LBound(rawData, 1) To UBound(rawData, 1)
    
    ' Get name17 from the first column of the data range.
    name17 = rawData(row, 1)
    
    ' Add name17 to dictionary if it doesn't already exist.
    If Not dict.Exists(name17) Then
        dict.Add name17, New Collection
    End If

    ' Get the coordinates from the last column of the data range.
    coordinates = rawData(row, lastCol)
    
    ' Use regular expression to parse coordinates.
    If Not re.Test(coordinates) Then
        With dataRange.Worksheet
            Set coordinateCell = .Cells(dataRange.Rows(1).row + row - 1, _
                dataRange.Columns(1).Column + lastCol - 1)
        End With
        Err.Raise vbObjectError + 1, "LoadData", "Invalid data format '" & _
            coordinates & "' in cell " & coordinateCell.Address
    End If

    Set m = re.Execute(coordinates)
    xval = m(0).SubMatches(0)
    yval = m(0).SubMatches(1)
    zval = m(0).SubMatches(2)
    
    ' Add data value.
    dict(name17).Add Array(xval, yval, zval)
    
Next row

Set LoadData = dict

End Function '

and the test sub:

' Sub Test()

Dim dataRange As Range
Dim data As Dictionary
Dim name17 As Variant
Dim val As Variant

Set dataRange = Worksheets("Sheet1").Range("BG10:BV365")
'Set dataRange = Range("BG10:BV365")
Set data = LoadData(dataRange)
For Each name17 In data.Keys
    Debug.Print "START " & name17
    Debug.Print "P " & data(name17).Count
    For Each val In data(name17)
        Debug.Print val(0), val(1), val(2)
    Next val
    Debug.Print "END " & name17
Next name17

End Sub ' 

This is my code, i compile, and nothing happens, i still get the error in m(0) file. But if i change the coordinate X=2181.18 Y=673.492 Z=-864.605 to decimal: X=2181,18 Y=673,492 Z=-864,605 the error is in the line:

            Err.Raise vbObjectError + 1, "LoadData", "Invalid data format '" & _
            coordinates & "' in cell " & coordinateCell.Address

with the message: "Invalid data format (x,y,z) in cell $BV$10 "

Pedro23
  • 1
  • 3
  • Welcome to Stack Overflow. Your question is confusing and appears to contain a lot of irrelevant detail. If you try provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) of the problem, you might find that more people will be willing to help. – Nicholas Hunter Apr 20 '21 at 21:01
  • Additionally, please read [this reference on avoiding Select](https://stackoverflow.com/q/10714251/3076048). It will get you started simplifying your code thus making it easier to quickly see your code logic and any bugs. – Mark Balhoff Apr 20 '21 at 21:07
  • Thanks for the advices. Now i have update the question and have included the final result wanted in .txt, from the link to download the files. – Pedro23 Apr 20 '21 at 21:17
  • I don't think many people are going to want to download a file to see your output. You should copy and paste as text into your question. – Nicholas Hunter Apr 20 '21 at 23:02
  • Hi @NicholasHunter, i have that in my text, and now i put a image... thanks a lot for the examples. i learn a lot what i see! – Pedro23 Apr 21 '21 at 07:27

1 Answers1

1

This function reads the selected data range and returns an dictionary of names and parsed coordinate values.

Option Explicit

' Function reads the given range and returns a dictionary.
'   key is the first column of the selected range,
'   value is a list of arrays (x, y, z).
Function LoadData(ByVal dataRange As Range) As Dictionary

    Dim dict As Dictionary
    Dim re As VBScript_RegExp_55.RegExp
    Dim m As VBScript_RegExp_55.MatchCollection
    Dim rawData As Variant
    Dim lastCol As Long
    Dim row As Long
    Dim name17 As String
    Dim coordinateCell As Range
    Dim coordinates As String
    Dim xval As Double
    Dim yval As Double
    Dim zval As Double
    
    ' Initialize dictionary.
    Set dict = New Dictionary
    
    ' Initialize regular expression.
    Set re = New RegExp
    re.Pattern = "^\s*X=(-?\d+\.\d+)\s+Y=(-?\d+\.\d+)\s+Z=(-?\d+\.\d+)\s*$"
    
    ' Get the coordinates from the last column of the data range.
    lastCol = dataRange.Columns.Count
    
    ' Copy selected range to array for faster processing.
    rawData = dataRange.Value
       
    ' For each row in data array...
    For row = LBound(rawData, 1) To UBound(rawData, 1)
        
        ' Get name17 from the first column of the data range.
        name17 = rawData(row, 1)
        
        ' Add name17 to dictionary if it doesn't already exist.
        If Not dict.Exists(name17) Then
            dict.Add name17, New Collection
        End If
    
        ' Get the coordinates from the last column of the data range.
        coordinates = rawData(row, lastCol)
        
        ' Use regular expression to parse coordinates.
        If Not re.Test(coordinates) Then
            With dataRange.Worksheet
                Set coordinateCell = .Cells(dataRange.Rows(1).row + row - 1, _
                    dataRange.Columns(1).Column + lastCol - 1)
            End With
            Err.Raise vbObjectError + 1, "LoadData", "Invalid data format '" & _
                coordinates & "' in cell " & coordinateCell.Address
        End If

        Set m = re.Execute(coordinates)
        xval = m(0).SubMatches(0)
        yval = m(0).SubMatches(1)
        zval = m(0).SubMatches(2)
        
        ' Add data value.
        dict(name17).Add Array(xval, yval, zval)
        
    Next row
    
    Set LoadData = dict

End Function

This is how you would use it.

Sub Test()

    Dim dataRange As Range
    Dim data As Dictionary
    Dim name17 As Variant
    Dim val As Variant
    
    Set dataRange = Worksheet("Sheet1").Range("BG10:BV365")
    Set data = LoadData(dataRange)
    For Each name17 In data.Keys
        Debug.Print "START " & name17
        Debug.Print "P " & data(name17).Count
        For Each val In data(name17)
            Debug.Print val(0), val(1), val(2)
        Next val
        Debug.Print "END " & name17
    Next name17
    
End Sub

You will need to add references to the following libraries to make this work.

Tools/References Add reference

This is the result

enter image description here

Nicholas Hunter
  • 1,791
  • 1
  • 11
  • 14
  • Hi, thanks for the answer... that code is another level for me!!! when you write: "This function reads the selected data range and returns an dictionary of names and parsed coordinate values. You need to select the data range before this function is executed." you mean that i need only to select the Column "NAME17" in my file? – Pedro23 Apr 22 '21 at 12:05
  • I have updated the code to make it easier for you to understand how to use it. You could also select the data range and then replace **Set data = LoadData(dataRange)** with **Set data = LoadData(Selection)**. Try it out and see if you can make it work for you. – Nicholas Hunter Apr 22 '21 at 12:36
  • Thanks!! I run your code in my original .xml file with (dateRange) but i have a problem in the line "xval = m(0).SubMatches(0)" with the "Run-time error '5' - Invalid procedure call or argument. – Pedro23 Apr 22 '21 at 14:48
  • it continues to give a error in the "xval = m(0)... – Pedro23 Apr 23 '21 at 16:22
  • What is the value of coordinates when you get the error? – Nicholas Hunter Apr 23 '21 at 22:25
  • I put a image in my answer with all values. I'm running the code in sheet 1 of original file. x=2181,18 – Pedro23 Apr 24 '21 at 10:24
  • I have updated the code to throw an error if it encounters any invalid values in the data range. Give it a try and see if it helps you find the problem. – Nicholas Hunter Apr 24 '21 at 12:30
  • It's gives a error.. Run-time error '-2147221503 (80040001)' Invalid data format in cell $BV$10 That cell have this values: X=2181.18 Y=673.492 Z=-864.605 – Pedro23 Apr 24 '21 at 17:31
  • I have added an image showing the first row of data, the way to call the function, I added a breakpoint to the function and display the values after parsing the coordinates, and the results of running the function on one row. Hopefully something in here will help you figure out what's going wrong on your end because short of remoting in to your desktop I am all out of ideas. Best of luck. – Nicholas Hunter Apr 24 '21 at 18:13
  • Nicholas, you select some columns to run the code? Or compile the sub only in the "Sheet1"? I run your code when i open de .xml file and the date formatted like a table, the data starts on row 10... it's stills give a error in invalid data format in cell BV10, with is the values of x,y, and z... i put the image of my variables when i run the code and a print of my file. – Pedro23 Apr 25 '21 at 17:54
  • I think I see the problem now. I did not account for the fact that the contents of the coordinates cells are packed left and right with spaces. I have updated the code. Give it another try. To answer your question, nothing has to be selected for this code to work. You have to pass in the data range as shown in the example. If you don't know what the data range is, that would be a separate question. – Nicholas Hunter Apr 25 '21 at 19:07
  • Hi, Nicholas, i have try again and now i don't have that error! :) But it stop again in the xval = m(0=.Submatches(0) I have put a new image with all variables and the code line: "This is the last result with the modification code" – Pedro23 Apr 26 '21 at 13:37
  • So it's just stopping by itself? Try opening the VBA editor and select Debug/Remove all breakpoints from the menu. – Nicholas Hunter Apr 26 '21 at 15:02
  • Yes. I put above my code.., I don't now if that my computer work's with decimals with (,) and your's with (.) or not, but if i change that the error change too, and stops in another line. – Pedro23 Apr 26 '21 at 17:20
  • I try but i still get the error, i don't now what to do.. @Nicholas do you think it's easier to copy the columns of data to another sheet and manipulate the date there? Like i put in the first part of my code in the first post? – Pedro23 Apr 27 '21 at 20:52
  • If you're still getting an error it is almost certainly because the format of the data you are passing into the LoadData function is incorrect. Remember the first column of the data range contains the name17 field and the last column of the data contains the coordinate data. The coordinate data must look like this: X=dotY=dotZ=dot. – Nicholas Hunter Apr 27 '21 at 21:15
  • It won't help moving the data to another sheet. It will definitely not work if you replace the dots with commas. I have now definitely run out of ideas. All I can suggest is that you look at the images I posted and see if you can reproduce them EXACTLY. Again, best of luck. – Nicholas Hunter Apr 27 '21 at 21:18