0

My macro works for a good chunk of the team wants, but the problem arises when certain columns don't generate in the same order.

You can test this on any IEEE Xplore advanced search URL (no institutional signup or even personal login is needed for this) such as this (this link automatically selects Per page = 100 to generate at least 100 results)

(You can optionally substitute another source for initial output if you prefer: In the Chrome Dev Tools > Network tab, sort the left-hand column Name parameters, click the one called "source" then select cURL as Bash and export to http://konklone.io/json/ or similar to generate a CSV containing all 100 rows allowed so you're starting with what I provide here for what Excel VBA needs. You can press Ctrl+R to Reload if the Name columns aren't showing what I have here.)

Some of the authors' values (names and partial IEEE path URLs) are generated and the macro generates a new first spreadsheet tab Destination which populates those (replacing the authors/0 id, authors/1 id, values in the column and inserts a hyperlink for those values so the author names hyperlink to the actual IEEE profile). But it seems to break down after the first few author columns (I see the author names but nothing hyperlinks after that point). Clearly, the columns must change sometimes after a given Network > Konklone output and now the macro can't match the right hyperlinks and names. This problem also happens occasionally with some Network/Konklone output columns further to the right. I just need the author (id/0, /1, /2 to the end) links and articles links (documentLink is one URL/value for each row of authors). The good news is the header row of the Network/Konklone output is consistent regardless of how many rows of results shown, so is there a way to exclude generating hyperlinks where no corresponding URL exists in the corresponding offset column/row? (I've read a little about Step -1 as a way to iterate through the first row so one can identify the row headings across the columns, but that is far above my VBA skillset.) Don't assume I'm very good so fine to spell out a solution for a noob! Many thanks in advance!

Sub DOoNOToUSEoIEEEAdvancedSearchReformat()
'starting from a page like https://ieeexplore.ieee.org/search/searchresult.jsp?action=search&matchBoolean=true&queryText=(%22All%20Metadata%22:cuda)%20AND%20(%22All%20Metadata%22:gpu)%20NOT%20(%22Author%20Affiliations%22:nvidia)&highlight=true&returnType=SEARCH&matchPubs=true&rowsPerPage=100&searchWithin=speed&returnFacets=ALL&pageNumber=1
'export it using JSON per How to do IEEExplore Advanced (Network console JSON to Excel).docx

Dim a As Long, i As Long, j As Long, LC As Long, LR As Long, z As Long
Dim cell As Range, M As Range, OneRange As Range, SortCell As Range
Dim celltourl As String, unlinked As String
Dim ws As Worksheet, wd As Worksheet, wa As Worksheet

If ActiveWorkbook.Name = ThisWorkbook.Name Then 'you are in the wrong file because the macro is not stored in the exported IEEExplore file
    MsgBox ("You do not have the IEEExplore Excel file generated by Konklone selected in the foreground. Also make sure:" & vbNewLine & _
    "1) column headings row is the 1st row" & vbNewLine & _
    "2) you didn't delete any other rows copied from Konklone" & vbNewLine & _
    "Then re-run macro by selecting View --> Macros --> View Macros --> select IEEEAdvSearchReformat macro --> Run.")
    Exit Sub
ElseIf ActiveSheet.Cells(1, 1).Value2 <> "authors/0/preferredName" Then
    MsgBox "It appears you have some extra rows above the column headings row, or maybe extra columns preceding Candidate Name. Please delete those then re-run macro."
    Exit Sub

Else
'Copy first sheet to a new sheet per https://docs.microsoft.com/en-us/office/vba/api/excel.worksheet.copy
    Set ws = ActiveSheet
'we will rename its tab later:   ws.Name = "IEEExploreSearchResultCleaned"
    Sheets.Add 'if no argument provided, Excel uses the default values and adds 1 worksheet before the active sheet, and now it is the active sheet
    Set wd = ActiveSheet 'provides a short variable reference to this sheet going forward
    wd.Name = "Destination" 'name its tab as well

'turn off screen updating and unfreeze all worksheets
Application.ScreenUpdating = False
For Each wa In Application.ActiveWorkbook.Worksheets
    wa.Activate
    With Application.ActiveWindow
        .FreezePanes = False
    End With
Next

ws.Activate 'make orig sheet active again before the next code section

'delete all columns containing these strings in the header row
'below adapted from https://www.extendoffice.com/documents/excel/3086-excel-delete-columns-based-on-header.html
Dim xFNum, xFFNum, xCount As Integer
Dim xStr As String
Dim xArrName As Variant
Dim MR, xRg As Range
On Error Resume Next
Set MR = Range("A1:CP1")
xArrName = Array("preferredName", "*/normalizedName", "patentCitation*", "accessType*", "publicationNumber", "articleNumber", "doi", "isNumber", "showHtml", "showAlgorithm", _
"showDataset", "showVideo", "ephemera", "vj", "rightsLink", "rightslinkFlag", "pdfSize", "startPage", "endPage", "publicationDate", "htmlLink", "publisher", "showCheckbox", "redline", "handleProduct", "highlightedTitle", "publicationTitle", _
"pdfLink", "articleContentType", "isStandard", "isConference", "isJournalAndMagazine", "isEarlyAccess", "isMagazine", "isJournal", "isBook", "course", "displayContentType", "docIdentifier", "isBookWithoutChapters", "volume", "issue", "majorTopic")
'enclose each column name (or partial name with wildcard to use with Like below) with quotes and separate by comma
'alternatively, have the array contain only substrings and surround with wildcards, e.g., .Range.Replace "*" & Array(i) & "*",
xCount = MR.Count
xStr = xArrName(xFNum)
For xFFNum = xCount To 1 Step -1
Set xRg = MR(1, xFFNum)
For xFNum = 0 To UBound(xArrName)
xStr = xArrName(xFNum)
If xRg.Value Like xStr Then xRg.EntireColumn.Delete
Next xFNum
Next

'only one not working is the articleTitle isn't hyperlinking in the test sheet (which means URL won't carry over to Destination Col A either

On Error Resume Next
Set M = Range("A1:AR1") 'AR is authors/7/id column
For i = 1 To M.Count
    If Cells(1, i).Value Like "*/id" Then 'i is an author column number
    For j = 2 To Cells(Rows.Count, i).End(xlUp).Row 'last non-blank cell in column i
        celltourl = Cells(j, i).Value
        'another way: If Not IsEmpty(Cells(j, i).Value)
        If celltourl <> "" Then
        Cells(j, i).Value = "https://ieeexplore.ieee.org/author/" + celltourl 'prepend to row j value
        ActiveSheet.Hyperlinks.Add Anchor:=Cells(j, i).Offset(0, -1), Address:=Cells(j, i).Value
        End If 'for inner If celltourl
    Next j 'done with authors
    ElseIf Cells(1, i).Value Like "documentLink" Then 'i is an article link column number, ie documentLink
    For j = 2 To Cells(Rows.Count, i).End(xlUp).Row 'last non-blank cell in column i
        celltourl = Cells(j, i).Value
        If celltourl <> "" Then
        Cells(j, i).Value = "https://ieeexplore.ieee.org" + celltourl 'prepend to row j value
        ActiveSheet.Hyperlinks.Add Anchor:=Cells(j, i).Offset(0, 3), Address:=Cells(j, i).Value 'add URL to articleTitle 3 cols to the right
        End If 'for inner If celltourl
    Range("U1:U101").Copy wd.Range("A1:A101") 'was col AL, now col U; copy/paste article links to Dest tab col A
    Next j 'done with article title/link
    ElseIf Cells(1, i).Value Like "publicationLink" Then 'i is pub link column number, ie PublicationLink
    For j = 2 To Cells(Rows.Count, i).End(xlUp).Row 'last non-blank cell in column i
        celltourl = Cells(j, i).Value
        If celltourl <> "" Then
        Cells(j, i).Value = "https://ieeexplore.ieee.org" + celltourl 'prepend to row j value
        ActiveSheet.Hyperlinks.Add Anchor:=Cells(j, i).Offset(0, 5), Address:=Cells(j, i).Value 'add URL to displayPublicationTitle 5 cols to the right
        End If 'for inner If celltourl
    Next j 'done with publication title/link
    ElseIf Cells(1, i).Value Like "citationsLink" Then 'i is citations link column number, ie citationsLink
    For j = 2 To Cells(Rows.Count, i).End(xlUp).Row 'last non-blank cell in column i
        celltourl = Cells(j, i).Value
        If celltourl <> "" Then
        Cells(j, i).Value = "https://ieeexplore.ieee.org" + celltourl 'prepend to row j value
        ActiveSheet.Hyperlinks.Add Anchor:=Cells(j, i).Offset(0, -4), Address:=Cells(j, i).Value 'add URL to citationCount 4 cols to the left
        End If 'for inner If celltourl
    Next j 'done with citation count/link
    End If 'done with link columns, now copy special columns to new locations on Destination tab
Next i
'    Range("U1:U101").Copy wd.Range("A1:A101") 'as col AL, now col U; copy/paste article links to Dest tab col A
    Range("AB1:AB101").Copy wd.Range("B1:B101") 'copy/paste abstract text to Dest tab col B
    Range("T1:T101").Copy wd.Range("C1:C101") 'copy/paste citation links to Dest tab col C
    Range("S1:S101").Copy wd.Range("D1:D101") 'copy/paste pub year to Dest tab col D
    Range("AA1:AA101").Copy wd.Range("E1:E101") 'copy/paste publication links to Dest tab col E
    Range("D1:D101").Copy wd.Range("F1:F101") 'copy/paste author 0 to Dest tab
    Range("H1:H101").Copy wd.Range("G1:G101") 'copy/paste author 1 to Dest tab
    Range("L1:L101").Copy wd.Range("H1:H101") 'copy/paste author 2 to Dest tab
    Range("P1:P101").Copy wd.Range("I1:I101") 'copy/paste author 3 to Dest tab
    Range("AE1:AE101").Copy wd.Range("J1:J101") 'copy/paste author 4 to Dest tab
    Range("AI1:AI101").Copy wd.Range("K1:K101") 'copy/paste author 5 to Dest tab
    Range("AM1:AM101").Copy wd.Range("L1:L101") 'copy/paste author 6 to Dest tab
    Range("AQ1:AQ101").Copy wd.Range("M1:M101") 'copy/paste author 7 to Dest tab

wd.Activate 'bring focus to Destination worksheet
'below works, but don't need it because above just copying over certain columns
'ws.Range("A1:AQ101").Copy wd.Range("E1:AS101") 'don't need to copy over all other columns, just leave them on ws tab for referennce
'On Error Resume Next
'Dim xFNumy, xFFNumy, xCounty As Integer
'Dim xStry As String
'Dim xArrNamey As Variant
'Dim MRy, xRgy As Range
'Set MRy = Range("A1:AS1")
'xArrNamey = Array("authors/0/preferredName", "*/firstName", "*/lastName", "*/id") 'delete some columns on wd, but keep on ws
'xCounty = MRy.Count
'xStry = xArrNamey(xFNumy)
'For xFFNumy = xCounty To 1 Step -1
'Set xRgy = MRy(1, xFFNumy)
'For xFNumy = 0 To UBound(xArrNamey)
'xStry = xArrNamey(xFNumy)
'If xRgy.Value Like xStry Then xRgy.EntireColumn.Delete
'If xRgy.Value Like "*/searchablePreferredName" Then xRgy.ColumnWidth = 20
'Next xFNumy
'Next

wd.Columns("A:B").ColumnWidth = 25 'increase width in artTitle, abstract cols
wd.Columns("C").ColumnWidth = 10 'increase width in citationCount col
wd.Columns("D").ColumnWidth = 4 'decrease width in Pub year col
wd.Columns("E").ColumnWidth = 18 'increase width in Pub title col
wd.Columns("F:I").ColumnWidth = 15 'increase width in first 4 author columns
wd.Columns("J:M").ColumnWidth = 10 'increase width in last 4 author columns
Application.GoTo Reference:=Range("A2") 'go to cell A2
'replicating what Freeze Panes --> Freeze Top Row does, per https://stackoverflow.com/questions/34094445/excel-vba-freeze-pane-without-select
With ActiveWindow
    If .FreezePanes Then .FreezePanes = False
    .SplitColumn = 0
    .SplitRow = 1
    .FreezePanes = True
End With


End If 'this is for the end of entire macro
Application.GoTo Reference:=wd.Range("A2")
Application.ScreenUpdating = True
End Sub

0 Answers0