admin 管理员组

文章数量: 1184232

In a project, I write xlsx files using python openpyxl.

Unfortunately, I can't find a correct way to use a regex function.

Here is my MWE:

from openpyxl import Workbook

classeur = Workbook()
sheet = classeur.active
_cell = sheet.cell(row=1, column=1)
_cell.value = "1;2;3;456;5;4;3;2;1"

_cell2 = sheet.cell(row=2, column=1)
_cell2.value = """=REGEXREPLACE(A1,"(?<![^;])[0-9]{1}(?![^;])";"@",0,0)"""

classeur.save("MWE.xlsx")

With libreoffice calc, correct formula should be =REGEX(A1;"(?<![^;])[0-9]{1}(?![^;])";"@";"g").

When opening the file (with calc), I get a Err:508

Context

The initial aim is to count exact occurences of substring. In the MWE below, you can see The "A" is counted 3 times, while I only want 1… Thus I think of regex to count only within ; delimiters.

from openpyxl import Workbook

classeur = Workbook()
sheet = classeur.active
_cA1 = sheet.cell(row=1, column=1)
_cA1.value = "1;2;3;456;5;4;3;2;1"

_cA2 = sheet.cell(row=2, column=1)
_cA2.value = """=REGEXREPLACE(A1,"(?<![^;])[0-9]{1}(?![^;])";"",0,0)"""


_cA4 = sheet.cell(row=4, column=1)
_cA4.value = "Text to count within"
_cB4 = sheet.cell(row=4, column=2)
_cB4.value = "A;Aze1;1A;Qsd;C"


_cA6 = sheet.cell(row=6, column=1)
_cA6.value = "substring to find"
_cB6 = sheet.cell(row=6, column=2)
_cB6.value = "Count"
for _i, _v in enumerate(["A","A1","1A","B","C"]):
    _c_i1 = sheet.cell(row=7+_i, column=1)
    _c_i1.value = _v
    
    _c_i2 = sheet.cell(row=7+_i, column=2)
    _c_i2.value = f"""=(LEN($B4)-LEN(SUBSTITUTE($B4,$A{7+_i},"")))/LEN($A{7+_i})"""

classeur.save("MWE.xlsx")

In a project, I write xlsx files using python openpyxl.

Unfortunately, I can't find a correct way to use a regex function.

Here is my MWE:

from openpyxl import Workbook

classeur = Workbook()
sheet = classeur.active
_cell = sheet.cell(row=1, column=1)
_cell.value = "1;2;3;456;5;4;3;2;1"

_cell2 = sheet.cell(row=2, column=1)
_cell2.value = """=REGEXREPLACE(A1,"(?<![^;])[0-9]{1}(?![^;])";"@",0,0)"""

classeur.save("MWE.xlsx")

With libreoffice calc, correct formula should be =REGEX(A1;"(?<![^;])[0-9]{1}(?![^;])";"@";"g").

When opening the file (with calc), I get a Err:508

Context

The initial aim is to count exact occurences of substring. In the MWE below, you can see The "A" is counted 3 times, while I only want 1… Thus I think of regex to count only within ; delimiters.

from openpyxl import Workbook

classeur = Workbook()
sheet = classeur.active
_cA1 = sheet.cell(row=1, column=1)
_cA1.value = "1;2;3;456;5;4;3;2;1"

_cA2 = sheet.cell(row=2, column=1)
_cA2.value = """=REGEXREPLACE(A1,"(?<![^;])[0-9]{1}(?![^;])";"",0,0)"""


_cA4 = sheet.cell(row=4, column=1)
_cA4.value = "Text to count within"
_cB4 = sheet.cell(row=4, column=2)
_cB4.value = "A;Aze1;1A;Qsd;C"


_cA6 = sheet.cell(row=6, column=1)
_cA6.value = "substring to find"
_cB6 = sheet.cell(row=6, column=2)
_cB6.value = "Count"
for _i, _v in enumerate(["A","A1","1A","B","C"]):
    _c_i1 = sheet.cell(row=7+_i, column=1)
    _c_i1.value = _v
    
    _c_i2 = sheet.cell(row=7+_i, column=2)
    _c_i2.value = f"""=(LEN($B4)-LEN(SUBSTITUTE($B4,$A{7+_i},"")))/LEN($A{7+_i})"""

classeur.save("MWE.xlsx")
Share Improve this question edited yesterday NBur asked yesterday NBurNBur 1594 silver badges12 bronze badges 5
  • Can you add a problem statement? What is the regex operation supposed to be doing? – Tim Biegeleisen Commented yesterday
  • @TimBiegeleisen see the edit. The regex replaces matches between ";" delimiters – NBur Commented yesterday
  • As noted in the documentation, neither REGEX nor REGEXREPLACE are part of the original specification so that, at the very least, these will need prefixing before use. Mind you, I wouldn't want to use anything like this within a spreadsheet: far too error prone and difficult to debug. – Charlie Clark Commented yesterday
  • @CharlieClark, if you know the prefix to use, please, tell me. BTW, many functions do need such prefix to be used with openpyxl. I tried _xlfn with no success… – NBur Commented 21 hours ago
  • If downvoter can explain what is missing, maybe I can improve the question. – NBur Commented 21 hours ago
Add a comment  | 

1 Answer 1

Reset to default 0

This is just a general answer and approach which can work here, and it doesn't require regular expressions:

value = "1;2;3;456;5;4;3;2;1"
value = ';' + value + ';'
len_orig = len(value)
len_new = len(value.replace(';1;', ''))
num_occur = (len_orig - len_new) / (2 + len('1'))
print('The number 1 occured ' + str(num_occur) + ' times.')
# The number 1 occured 2 times.

The basic approach here is to:

  • Surround the input string with ; delimiters on both sides. This ensures that every ssv value has ; on both sides.
  • Capture the length of this string.
  • Then replace ;1; with empty string.
  • Finally capture the new length, then calculate how many 1 occurred.

本文标签: Openpyxl regexreplaceStack Overflow