Text file to spreadsheet
Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:
A.txt
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22
B.txt
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14
C.txt
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33
D.txt
//empty file
E.txt
E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38
The Output:
spreadsheet_to_text.py
"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""
import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
What do you think about the code?
How can it be improved?
edit: you can find a programm doing it in reversed in Spreadsheets to text files
python excel
add a comment |
Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:
A.txt
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22
B.txt
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14
C.txt
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33
D.txt
//empty file
E.txt
E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38
The Output:
spreadsheet_to_text.py
"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""
import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
What do you think about the code?
How can it be improved?
edit: you can find a programm doing it in reversed in Spreadsheets to text files
python excel
2
Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18
add a comment |
Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:
A.txt
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22
B.txt
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14
C.txt
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33
D.txt
//empty file
E.txt
E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38
The Output:
spreadsheet_to_text.py
"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""
import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
What do you think about the code?
How can it be improved?
edit: you can find a programm doing it in reversed in Spreadsheets to text files
python excel
Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:
A.txt
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22
B.txt
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14
C.txt
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33
D.txt
//empty file
E.txt
E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38
The Output:
spreadsheet_to_text.py
"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""
import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
What do you think about the code?
How can it be improved?
edit: you can find a programm doing it in reversed in Spreadsheets to text files
python excel
python excel
edited Dec 19 '18 at 19:43
asked Dec 18 '18 at 20:10
Sandro4912
924121
924121
2
Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18
add a comment |
2
Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18
2
2
Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18
Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18
add a comment |
2 Answers
2
active
oldest
votes
There are a few things we could improve:
you could use the "lazy"
glob.iglob()
to filter out*.txt
files instead of doing theos.listdir()
, keeping the whole list if memory and having an extra check inside the loop:
for filename in glob.iglob("*.txt"):
instead of using
textfile.readlines()
and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:
for line in textfile:
instead of manually keeping track of
column
value, you could useenumerate()
:
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
Same idea could be applied for rows.
I think you don't have to use
get_column_letter()
and instead operate the numbers which you have:
sheet.cell(row=row, column=column).value = line
not to say anything against
openpyxl
, but I personally findxlsxwriter
module's API more enjoyable and more feature rich
Complete improved version:
import glob
import openpyxl
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
with open(filename) as textfile:
sheet.cell(row=1, column=column).value = filename
for row, line in enumerate(textfile, start=2):
sheet.cell(row=row, column=column).value = line
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
add a comment |
If you're on a Linux/Mac machine, it's as simple as this shell command:
paste ?.txt
The ?
wildcard will match all your files, A.txt
to E.txt
in order. The paste
command will paste them in parallel columns, separated by TABs.
You can then open your spreadsheet app and import the text file, and add the header.
Per a question formerly in comments: Can you auto-generate the header as well?
Sure:
for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt
Also, I'm assuming a single letter before .txt
, as in the original example. If you want all files ending in .txt
, then it's *.txt
instead of ?.txt
.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209933%2ftext-file-to-spreadsheet%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
There are a few things we could improve:
you could use the "lazy"
glob.iglob()
to filter out*.txt
files instead of doing theos.listdir()
, keeping the whole list if memory and having an extra check inside the loop:
for filename in glob.iglob("*.txt"):
instead of using
textfile.readlines()
and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:
for line in textfile:
instead of manually keeping track of
column
value, you could useenumerate()
:
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
Same idea could be applied for rows.
I think you don't have to use
get_column_letter()
and instead operate the numbers which you have:
sheet.cell(row=row, column=column).value = line
not to say anything against
openpyxl
, but I personally findxlsxwriter
module's API more enjoyable and more feature rich
Complete improved version:
import glob
import openpyxl
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
with open(filename) as textfile:
sheet.cell(row=1, column=column).value = filename
for row, line in enumerate(textfile, start=2):
sheet.cell(row=row, column=column).value = line
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
add a comment |
There are a few things we could improve:
you could use the "lazy"
glob.iglob()
to filter out*.txt
files instead of doing theos.listdir()
, keeping the whole list if memory and having an extra check inside the loop:
for filename in glob.iglob("*.txt"):
instead of using
textfile.readlines()
and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:
for line in textfile:
instead of manually keeping track of
column
value, you could useenumerate()
:
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
Same idea could be applied for rows.
I think you don't have to use
get_column_letter()
and instead operate the numbers which you have:
sheet.cell(row=row, column=column).value = line
not to say anything against
openpyxl
, but I personally findxlsxwriter
module's API more enjoyable and more feature rich
Complete improved version:
import glob
import openpyxl
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
with open(filename) as textfile:
sheet.cell(row=1, column=column).value = filename
for row, line in enumerate(textfile, start=2):
sheet.cell(row=row, column=column).value = line
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
add a comment |
There are a few things we could improve:
you could use the "lazy"
glob.iglob()
to filter out*.txt
files instead of doing theos.listdir()
, keeping the whole list if memory and having an extra check inside the loop:
for filename in glob.iglob("*.txt"):
instead of using
textfile.readlines()
and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:
for line in textfile:
instead of manually keeping track of
column
value, you could useenumerate()
:
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
Same idea could be applied for rows.
I think you don't have to use
get_column_letter()
and instead operate the numbers which you have:
sheet.cell(row=row, column=column).value = line
not to say anything against
openpyxl
, but I personally findxlsxwriter
module's API more enjoyable and more feature rich
Complete improved version:
import glob
import openpyxl
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
with open(filename) as textfile:
sheet.cell(row=1, column=column).value = filename
for row, line in enumerate(textfile, start=2):
sheet.cell(row=row, column=column).value = line
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
There are a few things we could improve:
you could use the "lazy"
glob.iglob()
to filter out*.txt
files instead of doing theos.listdir()
, keeping the whole list if memory and having an extra check inside the loop:
for filename in glob.iglob("*.txt"):
instead of using
textfile.readlines()
and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:
for line in textfile:
instead of manually keeping track of
column
value, you could useenumerate()
:
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
Same idea could be applied for rows.
I think you don't have to use
get_column_letter()
and instead operate the numbers which you have:
sheet.cell(row=row, column=column).value = line
not to say anything against
openpyxl
, but I personally findxlsxwriter
module's API more enjoyable and more feature rich
Complete improved version:
import glob
import openpyxl
def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
for column, filename in enumerate(glob.iglob("*.txt"), start=1):
with open(filename) as textfile:
sheet.cell(row=1, column=column).value = filename
for row, line in enumerate(textfile, start=2):
sheet.cell(row=row, column=column).value = line
workbook.save('result.xlsx')
if __name__ == "__main__":
text_into_spreadsheet()
edited Dec 18 '18 at 23:02
answered Dec 18 '18 at 22:44
alecxe
14.9k53478
14.9k53478
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
add a comment |
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
I think this solution is very very nice. It is alot better to read compared to what i did.
– Sandro4912
Dec 19 '18 at 18:17
add a comment |
If you're on a Linux/Mac machine, it's as simple as this shell command:
paste ?.txt
The ?
wildcard will match all your files, A.txt
to E.txt
in order. The paste
command will paste them in parallel columns, separated by TABs.
You can then open your spreadsheet app and import the text file, and add the header.
Per a question formerly in comments: Can you auto-generate the header as well?
Sure:
for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt
Also, I'm assuming a single letter before .txt
, as in the original example. If you want all files ending in .txt
, then it's *.txt
instead of ?.txt
.
add a comment |
If you're on a Linux/Mac machine, it's as simple as this shell command:
paste ?.txt
The ?
wildcard will match all your files, A.txt
to E.txt
in order. The paste
command will paste them in parallel columns, separated by TABs.
You can then open your spreadsheet app and import the text file, and add the header.
Per a question formerly in comments: Can you auto-generate the header as well?
Sure:
for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt
Also, I'm assuming a single letter before .txt
, as in the original example. If you want all files ending in .txt
, then it's *.txt
instead of ?.txt
.
add a comment |
If you're on a Linux/Mac machine, it's as simple as this shell command:
paste ?.txt
The ?
wildcard will match all your files, A.txt
to E.txt
in order. The paste
command will paste them in parallel columns, separated by TABs.
You can then open your spreadsheet app and import the text file, and add the header.
Per a question formerly in comments: Can you auto-generate the header as well?
Sure:
for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt
Also, I'm assuming a single letter before .txt
, as in the original example. If you want all files ending in .txt
, then it's *.txt
instead of ?.txt
.
If you're on a Linux/Mac machine, it's as simple as this shell command:
paste ?.txt
The ?
wildcard will match all your files, A.txt
to E.txt
in order. The paste
command will paste them in parallel columns, separated by TABs.
You can then open your spreadsheet app and import the text file, and add the header.
Per a question formerly in comments: Can you auto-generate the header as well?
Sure:
for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt
Also, I'm assuming a single letter before .txt
, as in the original example. If you want all files ending in .txt
, then it's *.txt
instead of ?.txt
.
edited Dec 20 '18 at 0:09
Sᴀᴍ Onᴇᴌᴀ
8,35261853
8,35261853
answered Dec 19 '18 at 16:13
LyleK
412
412
add a comment |
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209933%2ftext-file-to-spreadsheet%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18