Efficiently split large strings in Java
# Question

I have a large string that should be split at a certain character, unless that character is immediately preceded by another specific character.
What is the most efficient way to do this?

An example: split this string at ':', but not at "?:":

    part1:part2:https?:example.com:anotherstring

What I have tried so far:

- The regex `(?<!\?):`. Very slow.
- First collecting the indices at which to split the string, then splitting it. Only efficient if there are not many split characters in the string.
- Iterating over the string character by character. Efficient if there are not many protecting characters (e.g. '?').
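For reference, the character-by-character approach from the last bullet could look roughly like the sketch below. The class name `CharScanSplit` and the method `split` are hypothetical names chosen for illustration, and the separator and protecting character are passed in as parameters rather than hard-coded.

```java
import java.util.ArrayList;
import java.util.List;

final class CharScanSplit {

    // Splits text at every sep that is not immediately preceded by guard.
    // Hypothetical helper, not taken from the original post.
    static List<String> split(String text, char sep, char guard) {
        List<String> parts = new ArrayList<>();
        int start = 0;   // start index of the current segment
        char prev = 0;   // previous character; only read when i > 0
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == sep && (i == 0 || prev != guard)) {
                parts.add(text.substring(start, i));
                start = i + 1;
            }
            prev = c;
        }
        parts.add(text.substring(start)); // trailing segment (may be empty)
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("part1:part2:https?:example.com:anotherstring", ':', '?'));
        // prints [part1, part2, https?:example.com, anotherstring]
    }
}
```

This makes a single pass over the input and allocates only the resulting substrings, so its cost does not depend on how many protecting characters occur.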
# Answer 1
**Score**: 0

I'm afraid you will have to walk through the string and check whether each ":" is preceded by a "?":

    List<String> parts = new ArrayList<>(); // needs java.util.List and java.util.ArrayList
    int lastIndex = 0; // start of the current segment
    // Resume the search at index + 1 so that a protected ':' is skipped instead of being found again
    for (int index = string.indexOf(':'); index >= 0; index = string.indexOf(':', index + 1)) {
        if (index == 0 || string.charAt(index - 1) != '?') {
            parts.add(string.substring(lastIndex, index));
            lastIndex = index + 1;
        }
    }
    parts.add(string.substring(lastIndex)); // remaining tail after the last split
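A self-contained sketch of this `indexOf` approach, run against the example string from the question, might look as follows. The class name `IndexOfSplit` and the method `split` are hypothetical. Note that the search resumes at `index + 1`, not at the start of the current segment; otherwise a protected ":" would be found again on every iteration and the loop would never finish.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexOfSplit {

    // Splits text at every sep that is not immediately preceded by guard.
    // Hypothetical helper wrapping the loop from the answer.
    static List<String> split(String text, char sep, char guard) {
        List<String> parts = new ArrayList<>();
        int segmentStart = 0;
        int index = text.indexOf(sep);
        while (index >= 0) {
            if (index == 0 || text.charAt(index - 1) != guard) {
                parts.add(text.substring(segmentStart, index));
                segmentStart = index + 1;
            }
            // Always advance past the separator just examined.
            index = text.indexOf(sep, index + 1);
        }
        parts.add(text.substring(segmentStart)); // tail after the last split
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("part1:part2:https?:example.com:anotherstring", ':', '?'));
        // prints [part1, part2, https?:example.com, anotherstring]
    }
}
```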
# Answer 2
**Score**: 0

I have not tested this thoroughly, so you should do so carefully, but a regular expression passed to `split()` might produce the result you want:

```Java
public class SplitExample {
    public static void main(String[] args) {
        String s = "Start.Teststring.Teststring1?.Teststring2.?Teststring3.?.End";
        // Split at each '.' that is not preceded by '?' and not followed by another '.'
        String[] result = s.split("(?<!\\?)\\.(?!\\.)");
        System.out.println(String.join("|", result));
    }
}
```

Output:

    Start|Teststring|Teststring1?.Teststring2|?Teststring3|?.End

Note:
This example splits on a dot rather than a colon, and only when the dot is not preceded by a question mark. The lookahead `(?!\\.)` additionally skips a dot that is followed by another dot.

I don't think you will get a much more performant solution than the regex...
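If the regex route is kept, one thing that may be worth trying is compiling the pattern once and reusing it, since `String.split` compiles a pattern like this on each call. The sketch below uses the lookbehind from the question (`(?<!\?):`) rather than the dot-based pattern above; the class name `RegexSplit` and the method `split` are hypothetical, and whether this is fast enough for a given input would have to be measured.

```java
import java.util.regex.Pattern;

final class RegexSplit {

    // Compiled once and reused: matches a ':' that is not preceded by '?'.
    private static final Pattern UNPROTECTED_COLON = Pattern.compile("(?<!\\?):");

    // Hypothetical helper; a limit of -1 keeps trailing empty strings.
    static String[] split(String text) {
        return UNPROTECTED_COLON.split(text, -1);
    }

    public static void main(String[] args) {
        String s = "part1:part2:https?:example.com:anotherstring";
        System.out.println(String.join("|", split(s)));
        // prints part1|part2|https?:example.com|anotherstring
    }
}
```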