For files larger than the RAM you have, you have to work by "streaming" with a reader, so that only the lines you are currently working on are loaded in memory. You can also add some multithreaded (or promise/async-based) work to speed things up. Keep the "last line" processed in a variable, because for some big files the job can run at 100% on all cores for a week, and you want to be able to resume.

You can also simply fragment the file and then work on the pieces: first do a loop to split the big file into 4 GB files, then process each file separately. This way you can interrupt the job and suffer less loss in case of a power issue, or when you need your computing power for something else.

These are all tips for handling that kind of situation, and they come from general knowledge about working with very big files.
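A minimal sketch of the streaming-with-a-checkpoint idea in Python (the `progress.txt` checkpoint file name and the `process_line` function are my own placeholders, not anything standard). It reads the file line by line in binary mode, because in text mode Python disables `tell()` while iterating, and it periodically saves the byte offset so a crashed job can resume roughly where it left off:

```python
import os

CHECKPOINT = "progress.txt"  # hypothetical checkpoint file name

def process_line(line: str) -> None:
    # placeholder: put your real per-line work here
    pass

def stream_with_checkpoint(path: str) -> int:
    """Stream `path` line by line, resuming from a saved byte offset."""
    start = 0
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as cp:
            start = int(cp.read().strip() or 0)

    lines_done = 0
    # binary mode: readline() + tell() work together, and only one
    # line at a time is held in memory
    with open(path, "rb") as f:
        f.seek(start)
        while True:
            raw = f.readline()
            if not raw:
                break
            process_line(raw.decode("utf-8", errors="replace"))
            lines_done += 1
            if lines_done % 100_000 == 0:
                # persist the current byte offset so a power cut
                # loses at most 100k lines of work
                with open(CHECKPOINT, "w") as cp:
                    cp.write(str(f.tell()))
    return lines_done
```

To parallelize on top of this, a common pattern is to hand batches of decoded lines to a `concurrent.futures` pool rather than threading the reader itself, since the disk read is usually the sequential part anyway.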