[python] Search for a string in a file
phic
-
ccadic -
Hello.
I'm learning Python and I would like to write a small script that opens a file and searches for strings in it (which will be lines of code in another language) and then displays them with a print statement afterwards. How should I go about it?
Thank you in advance.
19 replies
Hello !
Here is a simple example:
We are looking for lines containing "coucou" in the file fichier.txt and displaying them:
#!/usr/bin/python
# -*- coding: iso-8859-1 -*-
chaine = "coucou"  # Text to search for
fichier = open("fichier.txt","r")
for ligne in fichier:
    if chaine in ligne:
        print ligne
fichier.close()
--
“Life is short - You need Python” -- Bruce Eckel, member of the ANSI C++ committee
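For readers on Python 3, the same idea can be sketched as below. This is only a sketch under assumptions: the function name `find_lines`, the UTF-8 encoding and the sample file contents are not from the thread, and `print` is a function in Python 3 rather than a statement.

```python
import tempfile
from pathlib import Path


def find_lines(path, needle, encoding="utf-8"):
    """Return the lines of the file at `path` that contain `needle`."""
    matches = []
    # The with-statement closes the file even if an error occurs.
    with open(path, "r", encoding=encoding) as handle:
        for line in handle:
            if needle in line:
                # Drop the trailing newline so callers can print cleanly.
                matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # Build a throwaway sample file so the sketch runs on its own.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "fichier.txt"
        sample.write_text("coucou tout le monde\nrien ici\nencore coucou\n",
                          encoding="utf-8")
        for hit in find_lines(sample, "coucou"):
            print(hit)
```

Returning a list instead of printing inside the loop makes it easier to reuse the result afterwards.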
Hello,
I don't know Python yet.
However, I think I can answer your question about
what chaine and ligne correspond to, given that only chaines seems to be declared.
Actually, chaines is not declared either, but rather initialized.
The variables ligne and chaine come into existence as soon as they are used, without needing a declaration as in C, for example.
Sebsauvage uses ligne and chaine to keep his example coherent, but you can use any names (just keep the variable names consistent with what you want to achieve :-))
Example:
fichier = open("fichier.txt","r")
for line in fichier:
    for mot in chaines:
        if mot in line:
            print line
fichier.close()
--
lami20j
There is no \n in your code
Yes, here:
o.write('%s\n' % i.replace(entr1,entr2))
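The blank lines in the output file most likely come from that line: each `i` returned by `readlines()` already ends with its own newline, and `'%s\n'` adds a second one. Here is a minimal sketch of one way around it, written in Python 3 syntax (the thread's code is Python 2). The function name `replace_in_lines` and the sample values are illustrative assumptions.

```python
def replace_in_lines(lines, old, new):
    """Replace `old` with `new` in each line, keeping exactly one newline per line."""
    result = []
    for line in lines:
        # Lines read from a file already end with a newline; remove it first
        # so that adding "\n" back does not produce a blank line.
        stripped = line.rstrip("\r\n")
        result.append(stripped.replace(old, new) + "\n")
    return result


lines = ["id=442\n", "name=test\n"]
print(repr("%s\n" % lines[0].replace("442", "444")))  # the newline is doubled
print(replace_in_lines(lines, "442", "444"))          # one newline per line
```

Writing `i.replace(...)` without the extra `\n` would also work, as long as every input line already ends with a newline.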
And if I want to search for multiple strings and display them line by line. I need to create a class, right?
Not necessarily.
It's up to you to choose whether you want to program in classes or not.
But if there are more than 30 types of strings to search for, it might take a long time.
We could do it like this:
#!/usr/bin/python
# -*- coding: iso-8859-1 -*-
strings = ["hello1", "hello2", "hello3"]  # Text to search for
file = open("file.txt","r")
for line in file:
    for string in strings:
        if string in line:
            print line
file.close()
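One detail of the nested loop: a line that contains two of the searched strings is printed twice. If that is not wanted, `any()` can be used instead, as in this Python 3 sketch. The function name `lines_matching_any` and the sample lines are assumptions for illustration.

```python
def lines_matching_any(lines, needles):
    """Return the lines that contain at least one of the strings in `needles`."""
    # any() stops at the first needle found, so a line is reported once
    # even when it contains several of the searched strings.
    return [line for line in lines if any(needle in line for needle in needles)]


sample = ["hello1 world\n", "nothing here\n", "hello2 and hello3\n"]
for line in lines_matching_any(sample, ["hello1", "hello2", "hello3"]):
    print(line, end="")
```

An opened file can be passed in place of the `sample` list, since iterating over a file yields its lines.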
Hello
I just read your code and two questions come to mind.
#!/usr/bin/python
# -*- coding: iso-8859-1 -*-
strings = ["hello1",
           "hello2",
           "hello3"]  # Text to search for
file = open("file.txt","r")
for line in file:
    for string in strings:
        if string in line:
            print line
file.close()
strings is declared, but what do string and line correspond to?
And if you want to replace hello with goodbye in your text file, how would you do that?
I am asking you these two questions because I have the same problem right now.
Here's my code.
print "start" # start of the procedure
from os import chdir
chdir("/Volumes/GERTEX/_test/")
import shutil, string, re
input_file = open("taglist.xml","r") # Read from the taglist file the last 60 lines
lines = input_file.readlines()[-59:]
input_file.close()
output_file = open("temp_taglist.xml","w") # Copy to the temp_taglist file the last 60 lines
output_file.write("".join(lines))
s = '>442<'
re.sub(r'\s','>444<',s)
output_file.close()
shutil.copyfile('temp_taglist.xml','new_taglist.xml') # Copy from taglist.xml to new_taglist.xml
print "end" # end of the procedure
The principle is as follows: I copy the last 60 lines of a file that I rewrite into a temp file. In this file, I replace the character string 442 with 444. Once formatted, I wish to copy the entirety of this file back to my original file as the penultimate line.
If you can help me ... thank you in advance.
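Three things stand out in the script above: `re.sub` returns a new string that is never written anywhere, its pattern `\s` matches whitespace rather than 442, and `[-59:]` keeps 59 lines, not 60. Here is a hedged Python 3 sketch of the copy-and-replace step only (inserting the result back into the original file is not covered). The function name `copy_tail_with_replacement` and the generated sample data are assumptions.

```python
import tempfile
from pathlib import Path


def copy_tail_with_replacement(src, dst, old, new, count=60):
    """Write the last `count` lines of `src` to `dst`, replacing `old` with `new`."""
    with open(src, "r", encoding="utf-8") as source:
        tail = source.readlines()[-count:]  # [-60:] keeps the last 60 lines
    with open(dst, "w", encoding="utf-8") as target:
        for line in tail:
            # str.replace returns a new string; it has to be written out explicitly.
            target.write(line.replace(old, new))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "taglist.xml"
        dst = Path(tmp) / "temp_taglist.xml"
        # Made-up sample content: one tag per line, numbered 400 to 499.
        src.write_text("".join("<tag>%d</tag>\n" % n for n in range(400, 500)),
                       encoding="utf-8")
        copy_tail_with_replacement(src, dst, ">442<", ">444<")
        print(dst.read_text(encoding="utf-8").splitlines()[:3])
```

For a fixed string such as `>442<`, `str.replace` is enough; the `re` module is only needed when the text to find is a pattern.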
Re,
I forgot the spaces. I can't edit my message
file = open("file.txt","r")
for line in file:
    for word in chains:
        if word in line:
            print line
file.close()
--
lami20j
That's exactly it.
The for loop creates the variables ligne and chaine and assigns them, one after the other, each of the values from fichier and chaines respectively.
Hello
I have a little problem with searching for a string.
I followed your loop, but now I would like to implement it in a
(entr1, entr2) = (StringVar(),StringVar())
def toto():
    s = open('/Volumes/GERTEX/_test/tampon_taglist.txt','r')
    o = open('/Volumes/GERTEX/_test/tampon1_taglist.txt','w')
    for i in s.readlines():
        o.write('%s\n' % i.replace(entr1,entr2))
    s.close()
    o.close()
bou1 = Button(fen1, text='Validate', command = toto)
And when I run it, it stops on the line:
o.write('%s\n' % i.replace(entr1,entr2))
I think it comes from the declaration of the variables, but I don't understand the problem.
Here is the complete script:
from Tkinter import *
from os import chdir
chdir("/Volumes/GERTEX/_test/")
#chdir("/_test/")
import shutil, string, re
obfic = open("taglist.txt","r")# Reading from the taglist file the last 60 lines
lignes = obfic.readlines()[-59:]
obfic.close()
obfic = open("tampon_taglist.txt","w") # Copying to the tampon_taglist file the last 60 lines
obfic.write("".join(lignes))
obfic.close()
# Main program
fen1 = Tk()
fen1.title("Calculating values of TAG files")
fen1.geometry('500x300')
#(franc1,franc2,franc3,franc4,franc5,franc6,euro) = (StringVar(),StringVar(),StringVar(),StringVar(),StringVar(),StringVar(),StringVar())
(entr1, entr2) = (StringVar(),StringVar())
def seteuro():
    s = open('/Volumes/GERTEX/_test/tampon_taglist.txt','r')
    o = open('/Volumes/GERTEX/_test/tampon1_taglist.txt','w')
    for i in s.readlines():
        # o.write('%s\n' % i.replace(franc1,franc2)),('%s\n' % i.replace(franc3,franc4)),('%s\n' % i.replace(franc5,franc6))
        o.write('%s\n' % i.replace(entr1,entr2))
    s.close()
    o.close()
# Listing of objects
txt1 = Label(fen1, text = 'Old ID1:')
txt2 = Label(fen1, text = 'New ID1:')
txt3 = Label(fen1, text = 'Old ID2:')
txt4 = Label(fen1, text = 'New ID2:')
txt5 = Label(fen1, text = 'Old mnemonic:')
txt6 = Label(fen1, text = 'New mnemonic:')
#txt7 = Label(fen1, text = '')
bou1 = Button(fen1, text='Validate', command = seteuro)
bou2 = Button(fen1, text='Quit', command = fen1.destroy)
entr1 = Entry(fen1)
entr2 = Entry(fen1)
entr3 = Entry(fen1)
entr4 = Entry(fen1)
entr5 = Entry(fen1)
entr6 = Entry(fen1)
#entr7 = Entry(fen1)
# Layout
txt1.grid(row =0)
txt2.grid(row =1)
txt3.grid(row =2)
txt4.grid(row =3)
txt5.grid(row =4)
txt6.grid(row =5)
#txt7.grid(row =5)
bou1.grid(row =10 ,column =1)
bou2.grid(row =10 ,column =2)
entr1.grid(row =0,column =1)
entr2.grid(row =1,column =1)
entr3.grid(row =2,column =1)
entr4.grid(row =3,column =1)
entr5.grid(row =4,column =1)
entr6.grid(row =5,column =1)
#entr7.grid(row =6,column =1)
# Start-up
fen1.mainloop()
I think it comes from the declaration of the variables but I don't understand the problem.
.replace() cannot work with tkinter variables.
.replace() expects standard Python strings, not tkinter variables (StringVar() is a tkinter variable).
So you should write:
def toto(entr1,entr2):
    s = open('/Volumes/GERTEX/_test/tampon_taglist.txt','r')
    o = open('/Volumes/GERTEX/_test/tampon1_taglist.txt','w')
    for i in s.readlines():
        o.write('%s\n' % i.replace(entr1,entr2))
    s.close()
    o.close()
for example, by putting:
entr1='coucou'
entr2='kiki'
toto(entr1,entr2)
StringVar() is only useful for reading or writing a value in a tkinter graphical interface.
It is unnecessary for Python programs themselves.
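To keep the tkinter fields and still call `.replace()`, the string held by a `StringVar` or an `Entry` is read with its `get()` method. The Python 3 sketch below illustrates the difference without opening a window: `FakeVar` is a hypothetical stand-in for the tkinter object, not part of tkinter, and the sample strings are made up.

```python
class FakeVar:
    """Hypothetical stand-in for a tkinter StringVar/Entry: holds a string, read via get()."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


entr1, entr2 = FakeVar("coucou"), FakeVar("kiki")
line = "coucou tout le monde\n"

try:
    line.replace(entr1, entr2)  # passing the objects themselves fails
except TypeError as error:
    print("TypeError:", error)

# Calling get() yields plain strings, which replace() accepts.
print(line.replace(entr1.get(), entr2.get()), end="")
```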
PS: Use the "code" button to place your source code inside.
Hi seb,
Thank you for all your information, it works like a charm.
But I have two problems left that I can't pin down.
The first is that in the txt file I'm using, there are blank lines between each line, and I don't see which parameter is causing this.
The second is that after creating my executable with py2exe, a command prompt window opens every time and I would like to get rid of it.
And, for my own information, I'm looking to integrate a sort of file explorer to browse for my files; can you help me out?
Thanks again.
You need to read this:
https://sebsauvage.net/python/gui/index_fr.html
This will explain how to read a value from a tkinter field and retrieve the string.
I tried your little example, sebsauvage, and I have a question: how come there is an automatic line break after each string found?
hi
hi2
hi3
I suppose everything is handled in 'line'. Sorry for this uninteresting question :p
I don't understand, there are no \n in your code, and in the file fichier.txt there is just:
coucou1
coucou2
coucou3
I don't see where you add the \n. It's a mystery to me :p
thank you
Oh okay, I was actually talking about this one:
#!/usr/bin/python
# -*- coding: iso-8859-1 -*-
chains = ["hello1", "hello2", "hello3"]  # Text to search
file = open("file.txt", "r")
for line in file:
    for chain in chains:
        if chain in line:
            print line
file.close()
which gives:
ruga@ubuntu:~/Desktop/docs/python$ cat file.txt
hello1
hello2
hello3
ruga@ubuntu:~/Desktop/docs/python$ python test.py
hello1
hello2
hello3
ruga@ubuntu:~/Desktop/docs/python$
Isn't that an automatic line break?
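The line break does not have to appear anywhere in the script: each line read from a file keeps the newline that ended it in the file, and printing adds one of its own on top. Here is a small Python 3 sketch that makes this visible. `io.StringIO` stands in for the opened file, which is an assumption made so the example runs on its own.

```python
import io

# io.StringIO stands in for the opened file so the sketch runs on its own.
fake_file = io.StringIO("hello1\nhello2\nhello3\n")

for line in fake_file:
    # repr() makes the newline that comes with each line visible.
    print(repr(line))

fake_file.seek(0)
for line in fake_file:
    # end="" stops print() from adding a second line break.
    print(line, end="")
```

In Python 2, the equivalent of `end=""` is a trailing comma (`print line,`).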
How can I write to the end of my file without overwriting what already exists?
f = open('monfichier.txt','a')
'a' for append.
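A short Python 3 sketch of append mode, using a temporary directory so that nothing real is touched. The file contents are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "monfichier.txt"
    path.write_text("first line\n", encoding="utf-8")

    # Mode "a" positions every write at the end of the file;
    # mode "w" would have emptied the file first.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write("second line\n")

    content = path.read_text(encoding="utf-8")
    print(content, end="")
```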
Thank you, it's working perfectly.
I have a small issue left.
This syntax works:
o.write('%s\n' % i.replace(entr1.get(), entr2.get()))
But when I add other variables, as below, the replacement only occurs on the first pair:
o.write('%s\n' % i.replace(entr1.get(), entr2.get()),
'%s\n' % i.replace(entr3.get(), entr4.get()))
Do you have an idea?
Thanks in advance.
>>> a='hello'
>>> print a.replace('e','*').replace('l','L')
h*LLo
>>>
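Chaining works because each `replace()` returns a new string, on which the next `replace()` is called. With more pairs, a loop reads better, as in this Python 3 sketch. The function name `replace_pairs` is an assumption; in the tkinter script the pairs would be built from `entr1.get()`, `entr2.get()` and so on.

```python
def replace_pairs(line, pairs):
    """Apply each (old, new) replacement to `line`, one after the other."""
    for old, new in pairs:
        # Each replace() works on the result of the previous one.
        line = line.replace(old, new)
    return line


print(replace_pairs("hello", [("e", "*"), ("l", "L")]))
```

The order of the pairs matters: a later pair also sees the text produced by an earlier one.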
Hello everyone!
I have a question. Is it possible to search for several specific strings in a file knowing that this string is not a "hardcoded" string? It would be best to have an example:
TheString = "good day|evening"
When the script searches for TheString in the file, is it possible to make it return one of the two (either "good day" or "good evening", depending on which one is in the file)?
But of course, the '|' symbol doesn’t work here. So actually, I'm looking for some sort of operator used to make an OR (I understand that "or" has a different use).
Another thing: is there some kind of wildcard for an arbitrary gap, for example to accept all strings like:
TheString = "I need * otherwise it won't work" (sorry for the lame example)
the '*' is the placeholder that can be anything. For example: 'Do this' or 'Eat that'
That is to say I want to find all strings that start with "I need" [with anything in between] and end with "otherwise it won't work". The goal is to search for all strings like:
"I need to eat a lot otherwise it won't work", "I need to make my bed every morning otherwise it won't work", "I need to drink coffee in the morning otherwise it won't work". Anyway, you get the idea :]
Hoping my questions are understandable...
Thanks in advance! :]
I see what you want to do.
You can use regular expressions (the re module)
For the first:
"good (day|evening)" (syntax to check, I haven't tested it)
For the second:
"I need (.+?) otherwise it won't work"
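Here is a small Python 3 sketch of both ideas with the `re` module, using the phrases from the question. The sample lines and variable names are made up. When part of the pattern comes from user input rather than being written in the script, `re.escape()` can be applied to that part so its characters are taken literally.

```python
import re

# "|" inside a group means "either alternative".
greeting = re.compile(r"good (day|evening)")
# "(.+?)" captures whatever sits between the two fixed parts (non-greedy).
need = re.compile(r"I need (.+?) otherwise it won't work")

lines = [
    "good evening everyone",
    "I need to drink coffee in the morning otherwise it won't work",
    "nothing to see here",
]

for line in lines:
    match = greeting.search(line)
    if match:
        print("greeting:", match.group(0))
    match = need.search(line)
    if match:
        print("placeholder:", match.group(1))
```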
The first is that in my txt file, which I am using, I have spaces between each line but I don't see which parameter this comes from.
Are you working under Unix/Linux?
Actually, there can be 3 different types of line endings for a text file: MS-Dos mode, Unix mode, and Macintosh mode.
MS-Dos mode uses 2 characters (\r\n) which can sometimes cause two line breaks.
It's hard to say: I have neither the program nor the file to process.
The second is that after creating my executable with py2exe a command prompt window opens every time and I would like to make it disappear.
In your py2exe script, instead of doing: console=["monprogramme.py"]
do: windows=["monprogramme.py"]
And just so you know, I'm looking to integrate some sort of explorer to browse for my files, can you help me with that?
If you are using tkinter for the graphical interface, there are ready-made dialog boxes for file or directory selection.
Examples: https://www.sebsauvage.net/python/snyppets/#tkinter_dialogs
Hello,
I wanted to know if it is possible to do the same thing but on the entire text, not on each line of the text. That is to say, a global search for the word throughout the text, without using a loop that searches line by line.
Thank you!
Do you have any ideas?
Thank you for your help.
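One possible approach, as a sketch only: read the whole file into a single string (for example with the file object's `read()` method) and search that string directly. In the Python 3 example below, an in-memory string stands in for the file contents, and the sample text is made up.

```python
import re

# A small in-memory text stands in for the result of handle.read().
text = "coucou1\nrien\ncoucou2 et coucou3\n"

# Membership test on the whole text, no line loop.
print("coucou" in text)

# str.count gives the number of non-overlapping occurrences.
print(text.count("coucou"))

# re.finditer reports where each occurrence starts.
positions = [match.start() for match in re.finditer(re.escape("coucou"), text)]
print(positions)
```

Reading everything at once is fine for small files; for very large ones, the line-by-line loop uses less memory.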