Instructions:
Windows:
Change the file to a .csv:
- Download and open Notepad++
- Right-click on the file you want to change and select "Open with Notepad++"
- Go to File > Save As
- Type in the file name with .csv at the end, and make sure "Save as type" is set to "All Types"
- Then save
Split the .csv into multiple files:
- Download a program called Git Bash here: https://git-scm.com/download/win
- Follow the installation instructions
- Open Git Bash
- Change to the directory where your file is located by typing cd "C:\Users\username\Documents" into the command line - the quotes must be kept, because the path contains backslashes
- NOTE: To copy, hold SHIFT and use the arrow keys to highlight the text, then hit ENTER; to paste into Git Bash, hit SHIFT and INSERT at the same time
- Then, to break apart the file, type split filename PreText -b 20m -a 5 -d. This will split your file into 20 MB pieces
- split: the command that you're running
- filename: the name of the file you want to split
- PreText: the name that will be given to your files in the output (default is x)
- -b 20m: the option that tells the program how to split the file
- -b = break it up by file size (20M, 20K, etc.)
- -l = break it up by number of lines (1000, 500, etc. - the default is 1000)
- -a 5: tells the program how many characters to use as a suffix in the output file names (1, 3, 5, etc.)
- -d: makes the suffix numbers instead of letters
- Once you hit enter, the program will immediately start generating your new split files in the same location as the original
- Finished!
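
The options above can be tried on a small throwaway file before running them on a real one. The following is only a sketch: it assumes the GNU version of split (the one that comes with Git Bash), and sample.csv and the part_ prefix are made-up names for illustration. It splits by line count (-l) instead of size (-b) so the sample can stay small.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample file: 2,500 short lines standing in for a large CSV.
seq 1 2500 > sample.csv

# Split into pieces of 1,000 lines each (-l 1000), using a
# 5-character (-a 5) numeric (-d) suffix and the prefix "part_".
split -l 1000 -a 5 -d sample.csv part_

# The pieces are named part_00000, part_00001 and part_00002.
ls part_*

# Line counts per piece: 1000, 1000 and 500.
wc -l part_*
```

To split by size instead, swap -l 1000 for -b 20m as in the step above.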
Mac:
Changing the file to a .csv:
- Open Terminal
- Go to the directory where your file is located
- Example: If your file is saved in your Downloads folder, change to that directory using the "cd" command. You would enter "cd Downloads" and hit the "Return" key on your keyboard. You'll then be in the right directory to change your file from a txt to a csv.
- Type in: sed -E 's/ +/,/g' oldfile.txt > newfile.csv (this replaces every run of spaces with a single comma)
- Use the actual names of your files in place of "oldfile" and "newfile"
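
To see what the conversion does, here is a small sketch run on a made-up three-line file (the contents of oldfile.txt are invented for illustration). It uses the -E (extended regular expression) spelling of the substitution, which is accepted by both the macOS and GNU versions of sed.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical space-separated input file.
printf 'id   name    score\n1    alice   90\n2    bob     85\n' > oldfile.txt

# Replace every run of one or more spaces with a single comma.
sed -E 's/ +/,/g' oldfile.txt > newfile.csv

cat newfile.csv
# id,name,score
# 1,alice,90
# 2,bob,85
```

Note that this treats every space as a separator, so a value that itself contains a space (for example a first and last name in one column) would be broken into two columns.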
Splitting the file into multiple files:
- Open Terminal
- Click on Search in the top right corner (marked 1 in the image)
- Enter "Terminal" and click on the highlighted app (marked 2 in the image below)
- Change to the directory where the CSV file is located, using the "cd" command. For example, if the file is in the "Downloads" folder, you would enter "cd Downloads" and hit the "Return" key on your keyboard
- Hint: After you hit "Return", you should see "Downloads" before the % on the next line
- Type "split -l [number of lines] [file name]" and hit the “Return” key on your keyboard.
- Replace "[number of lines]" with the number of lines you want in each resulting file, and "[file name]" with the name of the CSV file.
- For example, my file is named longoriginalfile.csv and has 55K lines
- I want each file to have no more than 30K lines, so my command is: split -l 30000 longoriginalfile.csv
- After you hit "Return", no message appears in Terminal, but new files are created in the folder that you are in.
- Example: in my Downloads folder I see 2 new files, xaa and xab. One file has 30K lines and the other has the remaining 25K.
- Finished!
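
Because split prints nothing when it finishes, it can help to check the result afterwards. The sketch below recreates the example above with a generated 55,000-line stand-in for longoriginalfile.csv (the file contents are invented for illustration), counts the lines in each piece, and confirms that the pieces put back together are identical to the original.

```shell
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the real file: 55,000 short lines.
seq 1 55000 > longoriginalfile.csv

# Split into pieces of at most 30,000 lines; the output files
# get the default names xaa and xab.
split -l 30000 longoriginalfile.csv

# Count the lines in each piece (30000 and 25000).
wc -l xaa xab

# Join the pieces in order and compare them with the original.
cat xaa xab | cmp - longoriginalfile.csv && echo "pieces match the original"
```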