
Please help me import large data files into a MySQL database

Discussion in 'General Programming Chat' started by timothywcrane, Mar 31, 2013.

  1. timothywcrane

    timothywcrane Power Member

    I am trying to load the infamous 2006 AOL search leak data into a database for use in a custom keyword selection toolkit based on MySQL and Workbench, with maybe an extra OpenOffice Base layer for fancy reports ;)

    The problem I am having is finding tools that can handle the rather large data files. The CSVs are too big to import with either phpMyAdmin or Workbench, and if I use a file-splitting tool to make the file size manageable, I get column misalignment that ruins any chance of an easy, error-free import.

    I am grateful for any suggestions.
     
  2. qrazy

    qrazy Senior Member

    If you have access to the PHP configuration file, you can tweak it to increase the memory and timeout limits and still use phpMyAdmin to import the files.
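
    (A minimal sketch of the relevant php.ini directives; the exact values here are assumptions and depend on the size of the dump:)

    Code:
    ; php.ini -- raise the limits so phpMyAdmin will accept large imports
    memory_limit        = 512M
    upload_max_filesize = 500M
    post_max_size       = 512M   ; should be larger than upload_max_filesize
    max_execution_time  = 600    ; seconds
    max_input_time      = 600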
     
  3. sukataetumba

    sukataetumba Senior Member

    Have you tried BigDump?

    Code:
    http://www.ozerov.de/bigdump/
     
  4. sxiclub

    sxiclub Registered Member

    You could zip or gzip the file; phpMyAdmin can import compressed (gzip/zip) dumps directly.
     
  5. BreakAllTheClocks

    BreakAllTheClocks Regular Member

  6. timothywcrane

    timothywcrane Power Member

    BigDump was a big help. I used a file splitter to get the data down to workable chunks and cleaned the files down to only one column (I only wanted the query data). I simply saved all of the cleaned files as CSVs in the BigDump directory and told Sikuli to delete each completed file and move on to the next, using BigDump's completion dialog as the visual cue. About fifty 5 MB CSVs to load, and I am at about 25. ;)

    I have to say that my SQL knowledge is nonexistent, and I am now stuck with a database of keywords that duplicate badly. It is currently at 20 million rows and climbing.

    From my research, the approach seems to be to build a new table, query for unique rows, transfer the data, and then remove the old table. The silver-spoonfeeding question of the day is this: once I remove all of the dupes, is there anything I need to do so that the records I enter into other tables from now on relate to the keyword data, such as a key ID or the like?

    I will be downloading and using BigDump to enter data into other tables, for use in formulas that tie in with each keyword. I am thinking that I might do this with MySQL data and an OpenOffice Base frontend, or maybe Workbench; if you have a suggestion for this from experience, I would appreciate it. This DB is going to be the engine for an automated AdSense network keyword research system. If all goes as planned, I might post updates on it in a journey thread. Thanks to all who offered advice.
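
    (For the key-ID question, a minimal sketch assuming a single keyword column; every table and column name here is illustrative, not from the thread:)

    Code:
    -- Deduplicated keyword table with a surrogate key.
    CREATE TABLE keywords (
        kw_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
        keyword VARCHAR(255) NOT NULL,
        PRIMARY KEY (kw_id),
        UNIQUE KEY uq_keyword (keyword)  -- blocks future duplicates
    );

    -- Copy the unique rows out of the raw import table.
    INSERT IGNORE INTO keywords (keyword)
    SELECT DISTINCT keyword FROM raw_import;

    -- Later tables reference the surrogate key, not the keyword text.
    CREATE TABLE kw_metrics (
        kw_id  INT UNSIGNED NOT NULL,
        volume INT,
        FOREIGN KEY (kw_id) REFERENCES keywords (kw_id)
    );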
     
  7. jazzc

    jazzc Moderator Staff Member Moderator Jr. VIP

    You're going to use 7-year-old data for an AdSense keyword research system??
     
  8. madoctopus

    madoctopus Supreme Member

    Maybe you should learn basic MySQL or whatever DBMS you plan to use... each one has commands for this. For example, in MySQL you can use the LOAD DATA INFILE command, among various other things.
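
    (For reference, a minimal LOAD DATA sketch; the file path and the table/column names are assumptions:)

    Code:
    -- Bulk-load a CSV in one pass; no file splitting or phpMyAdmin needed.
    -- LOCAL requires local_infile to be enabled on client and server.
    LOAD DATA LOCAL INFILE '/path/to/queries.csv'
    INTO TABLE raw_import
    FIELDS TERMINATED BY ','
    OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES  -- skip the header row, if there is one
    (keyword);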
     
  9. timothywcrane

    timothywcrane Power Member

    Yeah, I plan on using the queries as seeds for the system.
    Now that the seeds are entered, the first stop is still GAWT suggestions for traffic and value.

    "maybe u should learn basic mysql or whatever DBMS u plan to use... each one has commands for this." I agree. DBMS are my weak point.
     
  10. timothywcrane

    timothywcrane Power Member


    I went a slightly different route (I understand it takes more resources, but I am on a local system and would rather not SSH in).

    I created a clone table (only one column, so it was simple for me to do ;)) and then ran:

    Code:
    -- `Column` is backtick-quoted because COLUMN is a reserved word in MySQL.
    INSERT INTO newtable (`Column`)
    SELECT `Column`
    FROM `oldtable`
    GROUP BY `Column`;

    It is still working on the migration, so my fingers are crossed for success.
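
    (GROUP BY on the single column deduplicates as intended; an equivalent form using SELECT DISTINCT, with the same illustrative names, would be:)

    Code:
    INSERT INTO newtable (`Column`)
    SELECT DISTINCT `Column`
    FROM `oldtable`;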