Aug. 24, 2023, 9:53 a.m., by Lem
Yesterday I heard about Microsoft announcement about Python in Excel, which is quite a big deal at several levels. I have some questions, like is it possible to run Excel with a custom version of Python or is it necessarily on Microsoft servers.
But this article is not about it, the news just reignited my ancestral curiosity for spreadsheets.
Today we talk about regex in spreadsheets, since textual search or manipulation is frequent in table-shaped databases.
Regex (for rational expressions) are a powerful tool to find/extract/replace specific parts of a text, when some parts may vary. It can be hard to understand and in some cases make the system hard to maintain, but it is so powerful that many of us can not work without it.
I learned that regex can be used in dedicated functions of main spreadsheet softwares:
It can be handy for data extraction like finding a step number at a specific location in a text (example for LibreOffice):
Step 42: 45 turns
=REGEX(H21;"Step ([:digit:]+).*";"$1")
The formula output is 42
.
No extensive explanation of Regex conventions will follow, just a quick explanation of the formula:
H21
, the text to use as reference"Step ([:digit:]+).*"
is the regex in itself, here we want to find a number ([:digit:]+
in the LibreOffice convention, but \d+
in most other conventions, the +
stands for "one or more") which is set between parentheses to define it as captured group, we also include in the searched text the following characters since we want to replace them too"$1"
is the convention name for the first captured groupFor learning by doing you can experiment on regex101 web app, which gives extensive description of any regex and allows testing on a given text
May this help your big-throughput data pipelines or small organization phone number management. You can find more examples on the dedicated page on each spreadsheet software.
I would be curious to hear about your own use cases in the comments!
Soon the regex golf in Calc?
Source: https://xkcd.com/1313/