Remove all text after a defined character
goualman
Hello,
I have a document of several hundred lines in the format xxx:yyy | AAA (one entry per line).
I would like to remove the "|" character and all the text that follows it, throughout the entire document.
Thank you for your help!
3 replies
Hi goualman,
You can use any editor that supports regular expressions (regexp) and ask it to search for
\s+\|\s+.*$
and replace it with nothing.
You can see this regexp in action here: https://regex101.com/r/vZaC37/1
where you can see that it replaces:
xxx:yyy | AAA GJGJDJG JHGJGJGJ
xxx:yyy | AAA KJHKHDKJHK JKHKJHD
xxx:yyy | AAA KKJHDJKHKJKF
xxx:yyy | AAA KJHKDHKDFKJDHKJHK
with:
xxx:yyy
xxx:yyy
xxx:yyy
xxx:yyy
This regexp looks for one or more whitespace characters, followed by a pipe character (the vertical bar | produced by AltGr-6 on our French keyboards), followed by one or more whitespace characters, followed by anything up to the end of the line. It works as long as there is only one pipe character on each line, with space(s) both before and after it. It does not check the format of what comes before the pipe character and the space(s) that immediately precede it.
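For anyone who would rather script this than use an editor, here is a minimal Python sketch of the same search-and-replace. The function name strip_after_pipe and the sample text are only illustrative; the pattern is the one given above, and the MULTILINE flag is my assumption so that $ matches at the end of every line.

```python
import re

# Pattern from the answer above: whitespace, a literal pipe, whitespace,
# then everything up to the end of the line.
# re.MULTILINE makes "$" match at the end of each line, not only at the
# end of the whole text.
PIPE_TAIL = re.compile(r"\s+\|\s+.*$", re.MULTILINE)


def strip_after_pipe(text: str) -> str:
    """Remove the pipe, the whitespace around it, and the rest of each line."""
    return PIPE_TAIL.sub("", text)


sample = (
    "xxx:yyy | AAA GJGJDJG JHGJGJGJ\n"
    "xxx:yyy | AAA KJHKHDKJHK JKHKJHD\n"
)
print(strip_after_pipe(sample))
```

One caveat: \s also matches line breaks, so a line that ends right after the pipe and a space could pull the following line into the match. Using [ \t]+ instead of \s+ would avoid that if your data has such lines.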
Notepad++, gvim, etc. can do that.
You can also just use the link https://regex101.com/r/vZaC37/1 to copy your hundreds of lines into the top section instead of my test data, and get the result in the bottom section without having to install one of those text editors or learn how to use them.
However, I can only recommend that you learn to use these tools and regex.
Dal
yg_be
Hello, can you tell us more?
Is it an electronic document?
Everything works perfectly!!!!! Thank you very much for the time you took to help me! :)
I've just realized that some lines have multiple pipe characters. How can I erase all the text following the first pipe character (in other words, keep only the first part, before the first pipe character)?
For example:
xxx:yyy | AAA | GJGJDJG | JHGJGJGJ
becomes
xxx:yyy
Thank you in advance!
In fact, as long as there is no pipe character in the part you want to keep ("xxx:yyy"), having several pipe characters in the rest of the line is not a problem, because the regexp "consumes" the whole line from the first pipe surrounded by one or more spaces onward. Everything else up to the end of the line is covered by the regexp.
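To check this behaviour outside an editor, here is a small Python sketch run on the multi-pipe line from the question. The two function names are hypothetical, and the partition-based variant is an alternative I am adding for comparison, not something proposed in this thread.

```python
import re


def keep_before_pipe_regex(line: str) -> str:
    # Same regexp as before: the match starts at the first pipe that has
    # whitespace on both sides, and ".*$" swallows the rest of the line,
    # later pipes included.
    return re.sub(r"\s+\|\s+.*$", "", line)


def keep_before_pipe_partition(line: str) -> str:
    # Regex-free alternative (an assumption, not from the thread):
    # cut at the first pipe and trim the trailing whitespace.
    return line.partition("|")[0].rstrip()


line = "xxx:yyy | AAA | GJGJDJG | JHGJGJGJ"
print(keep_before_pipe_regex(line))
print(keep_before_pipe_partition(line))
```

The two are not strictly equivalent: the partition version cuts at the first pipe even when it has no spaces around it, whereas the regexp only matches a pipe with whitespace on both sides.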
You can see an example with the same regexp here:
https://regex101.com/r/vZaC37/2
where lines containing several pipe characters are still reduced to the part before the first one.
If you have one or more pipe characters surrounded by one or more spaces in the part you want to keep, we need more information on what is in there to distinguish them and create an adapted regexp matching this first part that you want to keep intact.