Text file to spreadsheet












9














Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:



A.txt



A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22


B.txt



B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14


C.txt



C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33


D.txt



//empty file


E.txt



E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38


The Output:



enter image description here



spreadsheet_to_text.py



"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""

import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter

def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')

if __name__ == "__main__":
text_into_spreadsheet()


What do you think about the code?
How can it be improved?



edit: you can find a programm doing it in reversed in Spreadsheets to text files










share|improve this question




















  • 2




    Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
    – Reinderien
    Dec 18 '18 at 23:18
















9














Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:



A.txt



A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22


B.txt



B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14


C.txt



C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33


D.txt



//empty file


E.txt



E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38


The Output:



enter image description here



spreadsheet_to_text.py



"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""

import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter

def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')

if __name__ == "__main__":
text_into_spreadsheet()


What do you think about the code?
How can it be improved?



edit: you can find a programm doing it in reversed in Spreadsheets to text files










share|improve this question




















  • 2




    Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
    – Reinderien
    Dec 18 '18 at 23:18














9












9








9


3





Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:



A.txt



A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22


B.txt



B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14


C.txt



C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33


D.txt



//empty file


E.txt



E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38


The Output:



enter image description here



spreadsheet_to_text.py



"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""

import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter

def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')

if __name__ == "__main__":
text_into_spreadsheet()


What do you think about the code?
How can it be improved?



edit: you can find a programm doing it in reversed in Spreadsheets to text files










share|improve this question















Lets say you have several .txt files in a folder and want to merge all text files into a single Excel document:



A.txt



A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
A16
A17
A18
A19
A20
A21
A22


B.txt



B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14


C.txt



C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33


D.txt



//empty file


E.txt



E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
E16
E17
E18
E19
E20
E21
E22
E23
E24
E25
E26
E27
E28
E29
E30
E31
E32
E33
E34
E35
E36
E37
E38


The Output:



enter image description here



spreadsheet_to_text.py



"""
Reads all .txt files in path of the script into a single
spreadsheet. In the first line of the spreadsheet the filename were
the data is from is displayed. Then the data follows
"""

import os
from typing import List
import openpyxl
from openpyxl.utils import get_column_letter

def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active
column: int = 1
filenames: List[str] = os.listdir()
for filename in filenames:
if filename.endswith(".txt"):
with open(filename) as textfile:
lines: List[int] = textfile.readlines()
sheet[get_column_letter(column) + '1'] = filename
row: int = 2
for line in lines:
sheet[get_column_letter(column) + str(row)] = line
row += 1
column += 1
workbook.save('result.xlsx')

if __name__ == "__main__":
text_into_spreadsheet()


What do you think about the code?
How can it be improved?



edit: you can find a programm doing it in reversed in Spreadsheets to text files







python excel






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 19 '18 at 19:43

























asked Dec 18 '18 at 20:10









Sandro4912

924121




924121








  • 2




    Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
    – Reinderien
    Dec 18 '18 at 23:18














  • 2




    Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
    – Reinderien
    Dec 18 '18 at 23:18








2




2




Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18




Does it have to be an Excel spreadsheet, or can it simply be importable by Excel? If the latter, you really should output to something like CSV, which will be faster and simpler.
– Reinderien
Dec 18 '18 at 23:18










2 Answers
2






active

oldest

votes


















11














There are a few things we could improve:





  • you could use the "lazy" glob.iglob() to filter out *.txt files instead of doing the os.listdir(), keeping the whole list if memory and having an extra check inside the loop:



    for filename in glob.iglob("*.txt"):



  • instead of using textfile.readlines() and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:



    for line in textfile:



  • instead of manually keeping track of column value, you could use enumerate():



    for column, filename in enumerate(glob.iglob("*.txt"), start=1):


    Same idea could be applied for rows.




  • I think you don't have to use get_column_letter() and instead operate the numbers which you have:



    sheet.cell(row=row, column=column).value = line


  • not to say anything against openpyxl, but I personally find xlsxwriter module's API more enjoyable and more feature rich





Complete improved version:



import glob

import openpyxl


def text_into_spreadsheet():
"""main logic for read .txt into spreadsheet"""
workbook = openpyxl.Workbook()
sheet = workbook.active

for column, filename in enumerate(glob.iglob("*.txt"), start=1):
with open(filename) as textfile:
sheet.cell(row=1, column=column).value = filename

for row, line in enumerate(textfile, start=2):
sheet.cell(row=row, column=column).value = line

workbook.save('result.xlsx')


if __name__ == "__main__":
text_into_spreadsheet()





share|improve this answer























  • I think this solution is very very nice. It is alot better to read compared to what i did.
    – Sandro4912
    Dec 19 '18 at 18:17



















4














If you're on a Linux/Mac machine, it's as simple as this shell command:



paste ?.txt


The ? wildcard will match all your files, A.txt to E.txt in order. The paste command will paste them in parallel columns, separated by TABs.



You can then open your spreadsheet app and import the text file, and add the header.



Per a question formerly in comments: Can you auto-generate the header as well?
Sure:



for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt


Also, I'm assuming a single letter before .txt, as in the original example. If you want all files ending in .txt, then it's *.txt instead of ?.txt.






share|improve this answer























    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "196"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209933%2ftext-file-to-spreadsheet%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    11














    There are a few things we could improve:





    • you could use the "lazy" glob.iglob() to filter out *.txt files instead of doing the os.listdir(), keeping the whole list if memory and having an extra check inside the loop:



      for filename in glob.iglob("*.txt"):



    • instead of using textfile.readlines() and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:



      for line in textfile:



    • instead of manually keeping track of column value, you could use enumerate():



      for column, filename in enumerate(glob.iglob("*.txt"), start=1):


      Same idea could be applied for rows.




    • I think you don't have to use get_column_letter() and instead operate the numbers which you have:



      sheet.cell(row=row, column=column).value = line


    • not to say anything against openpyxl, but I personally find xlsxwriter module's API more enjoyable and more feature rich





    Complete improved version:



    import glob

    import openpyxl


    def text_into_spreadsheet():
    """main logic for read .txt into spreadsheet"""
    workbook = openpyxl.Workbook()
    sheet = workbook.active

    for column, filename in enumerate(glob.iglob("*.txt"), start=1):
    with open(filename) as textfile:
    sheet.cell(row=1, column=column).value = filename

    for row, line in enumerate(textfile, start=2):
    sheet.cell(row=row, column=column).value = line

    workbook.save('result.xlsx')


    if __name__ == "__main__":
    text_into_spreadsheet()





    share|improve this answer























    • I think this solution is very very nice. It is alot better to read compared to what i did.
      – Sandro4912
      Dec 19 '18 at 18:17
















    11














    There are a few things we could improve:





    • you could use the "lazy" glob.iglob() to filter out *.txt files instead of doing the os.listdir(), keeping the whole list if memory and having an extra check inside the loop:



      for filename in glob.iglob("*.txt"):



    • instead of using textfile.readlines() and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:



      for line in textfile:



    • instead of manually keeping track of column value, you could use enumerate():



      for column, filename in enumerate(glob.iglob("*.txt"), start=1):


      Same idea could be applied for rows.




    • I think you don't have to use get_column_letter() and instead operate the numbers which you have:



      sheet.cell(row=row, column=column).value = line


    • not to say anything against openpyxl, but I personally find xlsxwriter module's API more enjoyable and more feature rich





    Complete improved version:



    import glob

    import openpyxl


    def text_into_spreadsheet():
    """main logic for read .txt into spreadsheet"""
    workbook = openpyxl.Workbook()
    sheet = workbook.active

    for column, filename in enumerate(glob.iglob("*.txt"), start=1):
    with open(filename) as textfile:
    sheet.cell(row=1, column=column).value = filename

    for row, line in enumerate(textfile, start=2):
    sheet.cell(row=row, column=column).value = line

    workbook.save('result.xlsx')


    if __name__ == "__main__":
    text_into_spreadsheet()





    share|improve this answer























    • I think this solution is very very nice. It is alot better to read compared to what i did.
      – Sandro4912
      Dec 19 '18 at 18:17














    11












    11








    11






    There are a few things we could improve:





    • you could use the "lazy" glob.iglob() to filter out *.txt files instead of doing the os.listdir(), keeping the whole list if memory and having an extra check inside the loop:



      for filename in glob.iglob("*.txt"):



    • instead of using textfile.readlines() and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:



      for line in textfile:



    • instead of manually keeping track of column value, you could use enumerate():



      for column, filename in enumerate(glob.iglob("*.txt"), start=1):


      Same idea could be applied for rows.




    • I think you don't have to use get_column_letter() and instead operate the numbers which you have:



      sheet.cell(row=row, column=column).value = line


    • not to say anything against openpyxl, but I personally find xlsxwriter module's API more enjoyable and more feature rich





    Complete improved version:



    import glob

    import openpyxl


    def text_into_spreadsheet():
    """main logic for read .txt into spreadsheet"""
    workbook = openpyxl.Workbook()
    sheet = workbook.active

    for column, filename in enumerate(glob.iglob("*.txt"), start=1):
    with open(filename) as textfile:
    sheet.cell(row=1, column=column).value = filename

    for row, line in enumerate(textfile, start=2):
    sheet.cell(row=row, column=column).value = line

    workbook.save('result.xlsx')


    if __name__ == "__main__":
    text_into_spreadsheet()





    share|improve this answer














    There are a few things we could improve:





    • you could use the "lazy" glob.iglob() to filter out *.txt files instead of doing the os.listdir(), keeping the whole list if memory and having an extra check inside the loop:



      for filename in glob.iglob("*.txt"):



    • instead of using textfile.readlines() and read all the lines in a file into memory, iterate over the file object directly in a "lazy" manner:



      for line in textfile:



    • instead of manually keeping track of column value, you could use enumerate():



      for column, filename in enumerate(glob.iglob("*.txt"), start=1):


      Same idea could be applied for rows.




    • I think you don't have to use get_column_letter() and instead operate the numbers which you have:



      sheet.cell(row=row, column=column).value = line


    • not to say anything against openpyxl, but I personally find xlsxwriter module's API more enjoyable and more feature rich





    Complete improved version:



    import glob

    import openpyxl


    def text_into_spreadsheet():
    """main logic for read .txt into spreadsheet"""
    workbook = openpyxl.Workbook()
    sheet = workbook.active

    for column, filename in enumerate(glob.iglob("*.txt"), start=1):
    with open(filename) as textfile:
    sheet.cell(row=1, column=column).value = filename

    for row, line in enumerate(textfile, start=2):
    sheet.cell(row=row, column=column).value = line

    workbook.save('result.xlsx')


    if __name__ == "__main__":
    text_into_spreadsheet()






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Dec 18 '18 at 23:02

























    answered Dec 18 '18 at 22:44









    alecxe

    14.9k53478




    14.9k53478












    • I think this solution is very very nice. It is alot better to read compared to what i did.
      – Sandro4912
      Dec 19 '18 at 18:17


















    • I think this solution is very very nice. It is alot better to read compared to what i did.
      – Sandro4912
      Dec 19 '18 at 18:17
















    I think this solution is very very nice. It is alot better to read compared to what i did.
    – Sandro4912
    Dec 19 '18 at 18:17




    I think this solution is very very nice. It is alot better to read compared to what i did.
    – Sandro4912
    Dec 19 '18 at 18:17













    4














    If you're on a Linux/Mac machine, it's as simple as this shell command:



    paste ?.txt


    The ? wildcard will match all your files, A.txt to E.txt in order. The paste command will paste them in parallel columns, separated by TABs.



    You can then open your spreadsheet app and import the text file, and add the header.



    Per a question formerly in comments: Can you auto-generate the header as well?
    Sure:



    for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt


    Also, I'm assuming a single letter before .txt, as in the original example. If you want all files ending in .txt, then it's *.txt instead of ?.txt.






    share|improve this answer




























      4














      If you're on a Linux/Mac machine, it's as simple as this shell command:



      paste ?.txt


      The ? wildcard will match all your files, A.txt to E.txt in order. The paste command will paste them in parallel columns, separated by TABs.



      You can then open your spreadsheet app and import the text file, and add the header.



      Per a question formerly in comments: Can you auto-generate the header as well?
      Sure:



      for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt


      Also, I'm assuming a single letter before .txt, as in the original example. If you want all files ending in .txt, then it's *.txt instead of ?.txt.






      share|improve this answer


























        4












        4








        4






        If you're on a Linux/Mac machine, it's as simple as this shell command:



        paste ?.txt


        The ? wildcard will match all your files, A.txt to E.txt in order. The paste command will paste them in parallel columns, separated by TABs.



        You can then open your spreadsheet app and import the text file, and add the header.



        Per a question formerly in comments: Can you auto-generate the header as well?
        Sure:



        for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt


        Also, I'm assuming a single letter before .txt, as in the original example. If you want all files ending in .txt, then it's *.txt instead of ?.txt.






        share|improve this answer














        If you're on a Linux/Mac machine, it's as simple as this shell command:



        paste ?.txt


        The ? wildcard will match all your files, A.txt to E.txt in order. The paste command will paste them in parallel columns, separated by TABs.



        You can then open your spreadsheet app and import the text file, and add the header.



        Per a question formerly in comments: Can you auto-generate the header as well?
        Sure:



        for f in ?.txt; do echo -en "$ft"; done; echo; paste ?.txt


        Also, I'm assuming a single letter before .txt, as in the original example. If you want all files ending in .txt, then it's *.txt instead of ?.txt.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Dec 20 '18 at 0:09









        Sᴀᴍ Onᴇᴌᴀ

        8,35261853




        8,35261853










        answered Dec 19 '18 at 16:13









        LyleK

        412




        412






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Code Review Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209933%2ftext-file-to-spreadsheet%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Probability when a professor distributes a quiz and homework assignment to a class of n students.

            Aardman Animations

            Are they similar matrix