用不同字符分隔excel中的列

2024-10-02 18:25:21 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个excel,有几千个数据源,是这样排列的:

例1:

Abbott KW, Snidal D (2009) The Governance Triangle: Regulatory Standards Institutions and the Shadow of the State. In: Mattli W , Woods N (eds) The Politics of Global Regulation, pp. 44–88. Princeton University Press, Princeton, NJ

例2:

Moschella M , Tsingou E (eds) (2013) Great Expectations, Slow Transformations: Incremental Change in Financial Governance. ECPR Press, Colchester

我需要用这些数据将它们分成7列:

  1. 第一作者
  2. 第二作者
  3. 第三至N作者
  4. 出版年份
  5. 源文章标题
  6. 发布于(不总是包含在内,但总是以in:开头)
  7. 更多信息-指在源文章标题中/之后发布的所有内容(以防它不是较大出版物的一部分)

我试着在excel中使用分栏工具,但是由于数据的多样性,我无法有效地完成。 有人知道解决这个问题的办法吗?你知道吗


Tags: ofthe数据in标题文章作者excel
2条回答

试试这个VBA宏。它使用正则表达式来解析不同的段;但是如果数据不是您所呈现的方式,那么它将失败;因此,如果出现失败,您需要了解它是如何与我的假设或您呈现数据的方式不匹配的。你知道吗

宏假定数据从A1开始,在A列中,第1行中没有标签。结果被写入B列和后续列;带有标签行1,但这些结果可以放在任何地方。你知道吗

此代码进入常规模块。你知道吗

Option Explicit
Sub ParseBiblio()
    Dim vData As Variant
    Dim vBiblios() As Variant
    Dim rRes As Range
    Dim re As Object, mc As Object
    Dim I As Long

'Assume Data is in column A.
'Might need to start at row 2 if there is a label row
vData = Range("A1", Cells(Rows.Count, "A").End(xlUp))

'Results to start in Column B with labels in row 1
Set rRes = Range("b1")

Set re = CreateObject("vbscript.regexp")
With re
    .MultiLine = True
    .Global = True
    .ignorecase = True
    .Pattern = "(^[^,]+),?\s*([^,]+?)(?:,\s*([^(]+))?\s*\((\d{4})\)\s*(.*?\.)\s*(?:In:\s*(.*)\.)?\s*(.*)"
End With

'Results array and labels
ReDim vBiblios(1 To UBound(vData) + 1, 1 To 7)
    vBiblios(1, 1) = "First Author"
    vBiblios(1, 2) = "Second Author"
    vBiblios(1, 3) = "Other Authors"
    vBiblios(1, 4) = "Publication Year"
    vBiblios(1, 5) = "Title"
    vBiblios(1, 6) = "Published In"
    vBiblios(1, 7) = "More Info"

For I = 1 To UBound(vData)
    Set mc = re.Execute(vData(I, 1))
    If mc.Count > 0 Then
        With mc(0)
            vBiblios(I + 1, 1) = .submatches(0)
            vBiblios(I + 1, 2) = .submatches(1)
            vBiblios(I + 1, 3) = .submatches(2)
            vBiblios(I + 1, 4) = .submatches(3)
            vBiblios(I + 1, 5) = .submatches(4)
            vBiblios(I + 1, 6) = .submatches(5)
            vBiblios(I + 1, 7) = .submatches(6)
        End With
    End If
Next I

Set rRes = rRes.Resize(rowsize:=UBound(vBiblios, 1), columnsize:=UBound(vBiblios, 2))
rRes.EntireColumn.Clear
rRes = vBiblios
With rRes
    With .Rows(1)
        .Font.Bold = True
        .HorizontalAlignment = xlCenter
    End With
    .EntireColumn.AutoFit
End With

End Sub

请参阅How to split Bibiliography MLA string into BibTex using c#?,我链接了几个从格式化文本中提取书目信息的专用工具。你知道吗

相关问题 更多 >