Document how to perform advanced string splitting using RegEx

This closes https://github.com/godotengine/godot-docs/issues/3607.

(cherry picked from commit 5f2b6bd476)
Hugo Locurcio 2020-07-29 12:12:01 +02:00 committed by Rémi Verschelde
parent bd76fcd43b
commit 4a0568b609
2 changed files with 14 additions and 4 deletions


@ -817,8 +817,8 @@
<argument index="2" name="maxsplit" type="int" default="0"> <argument index="2" name="maxsplit" type="int" default="0">
</argument> </argument>
<description> <description>
Splits the string by a [code]delimiter[/code] string and returns an array of the substrings. Splits the string by a [code]delimiter[/code] string and returns an array of the substrings. The [code]delimiter[/code] can be of any length.
If [code]maxsplit[/code] is specified, it defines the number of splits to do from the left up to [code]maxsplit[/code]. The default value of 0 means that all items are split. If [code]maxsplit[/code] is specified, it defines the number of splits to do from the left up to [code]maxsplit[/code]. The default value of [code]0[/code] means that all items are split.
Example: Example:
[codeblock] [codeblock]
var some_string = "One,Two,Three,Four" var some_string = "One,Two,Three,Four"
@ -827,6 +827,7 @@
print(some_array[0]) # Prints "One" print(some_array[0]) # Prints "One"
print(some_array[1]) # Prints "Two,Three,Four" print(some_array[1]) # Prints "Two,Three,Four"
[/codeblock] [/codeblock]
If you need to split strings with more complex rules, use the [RegEx] class instead.
</description> </description>
</method> </method>
<method name="split_floats"> <method name="split_floats">


@ -11,7 +11,7 @@
regex.compile("\\w-(\\d+)") regex.compile("\\w-(\\d+)")
[/codeblock] [/codeblock]
The search pattern must be escaped first for GDScript before it is escaped for the expression. For example, [code]compile("\\d+")[/code] would be read by RegEx as [code]\d+[/code]. Similarly, [code]compile("\"(?:\\\\.|[^\"])*\"")[/code] would be read as [code]"(?:\\.|[^"])*"[/code]. The search pattern must be escaped first for GDScript before it is escaped for the expression. For example, [code]compile("\\d+")[/code] would be read by RegEx as [code]\d+[/code]. Similarly, [code]compile("\"(?:\\\\.|[^\"])*\"")[/code] would be read as [code]"(?:\\.|[^"])*"[/code].
Using [method search] you can find the pattern within the given text. If a pattern is found, [RegExMatch] is returned and you can retrieve details of the results using functions such as [method RegExMatch.get_string] and [method RegExMatch.get_start]. Using [method search], you can find the pattern within the given text. If a pattern is found, [RegExMatch] is returned and you can retrieve details of the results using methods such as [method RegExMatch.get_string] and [method RegExMatch.get_start].
[codeblock] [codeblock]
var regex = RegEx.new() var regex = RegEx.new()
regex.compile("\\w-(\\d+)") regex.compile("\\w-(\\d+)")
@ -19,7 +19,7 @@
if result: if result:
print(result.get_string()) # Would print n-0123 print(result.get_string()) # Would print n-0123
[/codeblock] [/codeblock]
The results of capturing groups [code]()[/code] can be retrieved by passing the group number to the various functions in [RegExMatch]. Group 0 is the default and will always refer to the entire pattern. In the above example, calling [code]result.get_string(1)[/code] would give you [code]0123[/code]. The results of capturing groups [code]()[/code] can be retrieved by passing the group number to the various methods in [RegExMatch]. Group 0 is the default and will always refer to the entire pattern. In the above example, calling [code]result.get_string(1)[/code] would give you [code]0123[/code].
This version of RegEx also supports named capturing groups, and the names can be used to retrieve the results. If two or more groups have the same name, the name would only refer to the first one with a match. This version of RegEx also supports named capturing groups, and the names can be used to retrieve the results. If two or more groups have the same name, the name would only refer to the first one with a match.
[codeblock] [codeblock]
var regex = RegEx.new() var regex = RegEx.new()
@ -34,6 +34,15 @@
print(result.get_string("digit")) print(result.get_string("digit"))
# Would print 01 03 0 3f 42 # Would print 01 03 0 3f 42
[/codeblock] [/codeblock]
[b]Example of splitting a string using a RegEx:[/b]
[codeblock]
var regex = RegEx.new()
regex.compile("\\S+") # Negated whitespace character class.
var results = []
for result in regex.search_all("One Two \n\tThree"):
    results.push_back(result.get_string())
# The `results` array now contains "One", "Two", "Three".
[/codeblock]
[b]Note:[/b] Godot's regex implementation is based on the [url=https://www.pcre.org/]PCRE2[/url] library. You can view the full pattern reference [url=https://www.pcre.org/current/doc/html/pcre2pattern.html]here[/url]. [b]Note:[/b] Godot's regex implementation is based on the [url=https://www.pcre.org/]PCRE2[/url] library. You can view the full pattern reference [url=https://www.pcre.org/current/doc/html/pcre2pattern.html]here[/url].
[b]Tip:[/b] You can use [url=https://regexr.com/]Regexr[/url] to test regular expressions online. [b]Tip:[/b] You can use [url=https://regexr.com/]Regexr[/url] to test regular expressions online.
</description> </description>
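
The example added to the RegEx description collects the tokens themselves (`\S+`). A complementary approach, closer to what `String.split` does, is to match the delimiters and keep the text between them. The following is only a minimal sketch under stated assumptions: `split_by_pattern` is a hypothetical helper, not part of Godot's API, and it assumes the pattern never matches an empty string.

```gdscript
# Hypothetical helper (not part of Godot's API): splits `text` on every
# match of the delimiter `pattern`, similar in spirit to String.split().
# Assumes `pattern` never matches an empty string.
func split_by_pattern(text: String, pattern: String) -> Array:
    var regex = RegEx.new()
    # compile() returns an Error code; fall back to the unsplit text on failure.
    if regex.compile(pattern) != OK:
        return [text]
    var parts = []
    var last_end = 0
    for found in regex.search_all(text):
        # Keep the text between the previous delimiter and this one.
        parts.push_back(text.substr(last_end, found.get_start() - last_end))
        last_end = found.get_end()
    # Keep whatever follows the last delimiter.
    parts.push_back(text.substr(last_end, text.length() - last_end))
    return parts
```

With this sketch, `split_by_pattern("One, Two;Three", "[,;]\\s*")` would return `["One", "Two", "Three"]`. Like `String.split` with its default `allow_empty`, it keeps empty strings when two delimiters are adjacent.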