import java.util.regex.Matcher;
import java.util.regex.Pattern;
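public class Example {
    public static void main(String[] args) {
        // Group 1: integer part, optionally followed by space-separated groups of three digits ("20 000").
        // Group 3 (optional): a ',' or '.' separator, an optional space, then the fraction digits (group 4).
        final String regex = "\\b(\\d{1,}([ ]\\d{3})*)([,.] ?(\\d*))?\\b";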
final String string = "== some english sentences ==\n"
+ "pay 123 usd\n"
+ "i want to transfer 100,1234$\n"
+ "how many USD will i get for 47,11 EUR?\n"
+ "Please advice how can I add cash 4000eur on this ban account?\n"
+ "I would like to buy in your bank 3000 euro.\n"
+ "I’ve deposited 2000$ on on the day of opening my account\n"
+ "50€ transferred to my bank account\n"
+ "Why did the bank put the 50,000 US dollars transferred to me into my euro account\n\n"
+ "== some slovak sentences ==\n"
+ "Ahoj prečo mám na účtu -165 eura ?\n"
+ "Ako uhradim na česky účet 800 CZK ?\n"
+ "Chcem spravit prevod nad limit 30000€\n"
+ "chcem vyplatiť restruktulizovane prečerpanie vo výške 270, 83 €\n"
+ "Adam, potrebujem zamenit 18.000 eur na Ceske koruny. Mozem dostat individualny kurz?\n"
+ "Chcela by som zamenit české koruny v sume 20 000 na eura\n"
+ "Chcem zamenit 10000 forintov na euro mate aj poplatok\n"
+ "chcem si na pobočke v Malackách zameniť 7800 CHF na Euro.\n"
+ "chcel by som zameniť české koruny na euro aký je kurz aake sú poplatky suma by bola 25000ceskych korún\n"
+ "Koľko je 10 000 € keď si zamenim za poľské zlote ?\n"
+ "mám 10.000 USD a chcem ich zameniť na euro\n"
+ "chce aby som zamenil 31eur na doláre\n"
+ "Ak chcem kupovat od banky franky, tak ide zo strany banky o predaj valut, a je tam kurz valuta predaj franky 0,9889\n\n"
+ "== simple numbers ==\n"
+ "100,10\n"
+ "425,90\n"
+ "50.000kc\n"
+ "62,17 eur\n"
+ "-0,13€\n"
+ "10,- eur\n"
+ "-100€\n"
+ "1000000eur\n"
+ "10134,20cz\n"
+ "4.000\n"
+ "40tisíc\n"
+ "45,000\n"
+ "9.93eur\n\n\n\n"
+ "== length corner cases: ==\n"
+ "12345678901234567890 --> up to 20 digits\n"
+ "1,12345678 --> up to 8 post-comma digits --> value = 1,1235 (rounded)?\n"
+ "1,12345678 --> up to 8 post-comma digits --> value = 1,1234 (truncated)?\n"
+ "123.123.123,123567 --> what about more than 4 decimal places: \n"
+ "--> to be confirmed\n\n"
+ "== unusual post-comma lengths: ==\n"
+ "10,1\n"
+ "10,123\n\n"
+ "== what about separators for thousand groups like: ==\n"
+ "123.123.123,00 EUR\n"
+ "123 123 123 USD\n"
+ "--> this is questionable (not usual in Slovak)\n\n"
+ "== allow dot for comma? ==\n"
+ "123.12 --> value = 123,1200\n"
+ "--> yes, as answered in Inez's questions: ‘,‘ and ‘.‘ are expected only as decimal separators in decimal numbers not as separators of hundreds/thousands etc. (not typical in written Slovak)\n\n"
+ "== things that would NOT need disambiguation: ==\n"
+ "123.456 --> to map to value 123,4560 and NOT 123456,0000 !\n"
+ "--> dot considered as decimal separator so map to 123,4560\n\n"
+ "== things that would need disambiguation: ==\n"
+ "- none -\n";
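        // Replacement template: $0 is the whole match, $1 the integer part, $4 the fraction digits.
        final String subst = "[$0 --> ($1),($4)]";
        // MULTILINE only changes how ^ and $ match; this pattern uses neither, so the flag has no effect here.
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);
        // replaceAll returns a new string in which every match is replaced by the expanded template.
        final String result = matcher.replaceAll(subst);
        System.out.println("Substitution result: " + result);
    }
}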
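The notes embedded in the test data say that ',' and '.' are both decimal separators and that "123.12" should map to the value 123,1200. The following is a minimal sketch of how the capture groups of the same pattern could be turned into such values. The class name `AmountExtractor` and the method `extract` are hypothetical, and the use of `RoundingMode.HALF_UP` is an assumption, since the test data leaves rounding versus truncation "to be confirmed".

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the generated sample.
public class AmountExtractor {
    // Same pattern as in the sample above.
    private static final Pattern AMOUNT =
            Pattern.compile("\\b(\\d{1,}([ ]\\d{3})*)([,.] ?(\\d*))?\\b");

    /** Returns every matched amount, normalized to four decimal places. */
    public static List<BigDecimal> extract(String text) {
        List<BigDecimal> amounts = new ArrayList<>();
        Matcher m = AMOUNT.matcher(text);
        while (m.find()) {
            // Group 1 may contain spaces between thousand groups ("20 000").
            String integerPart = m.group(1).replace(" ", "");
            // Group 4 is null when there is no separator, and empty when
            // the separator is not followed by digits.
            String fraction = m.group(4);
            if (fraction == null || fraction.isEmpty()) {
                fraction = "0";
            }
            BigDecimal value = new BigDecimal(integerPart + "." + fraction);
            // Assumption: round half up. Use RoundingMode.DOWN to truncate instead.
            amounts.add(value.setScale(4, RoundingMode.HALF_UP));
        }
        return amounts;
    }

    public static void main(String[] args) {
        System.out.println(extract("123.12"));                 // [123.1200]
        System.out.println(extract("v sume 20 000 na eura"));  // [20000.0000]
        System.out.println(extract("1,12345678"));             // [1.1235]
    }
}
```

Iterating with `find()` instead of `replaceAll` gives direct access to each group, so the normalization rule can be changed in one place once the open questions in the test data are settled.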