22,000 comments

Discussion in 'Black Hat SEO' started by mrgrim, Sep 13, 2011.

  1. mrgrim

    mrgrim Junior Member

    Joined:
    Dec 29, 2009
    Messages:
    168
    Likes Received:
    17
    Ok, so I have 66,000 comments and I'm trying to clean them. I even tried importing them as an HTML file in a browser, but it all came out as one line. They have HTML tags mixed with some markup for styling. There are also numbers at the front of some of the lines (12.) 13.) 14.) etc.), along with some quotes. Find and replace won't cut it for this. Any ideas?

    Edit: 66,000 not 22,000
     
    Last edited: Sep 13, 2011
  2. Nitros

    Nitros Power Member

    Joined:
    Jan 30, 2009
    Messages:
    573
    Likes Received:
    295
    A few simple lines in a PHP script should do the work ;)
     
  3. accelerator_dd

    accelerator_dd Jr. VIP Jr. VIP Premium Member

    Joined:
    May 14, 2010
    Messages:
    2,441
    Likes Received:
    1,005
    Occupation:
    SEO
    Location:
    IM Wonderland
  4. mrgrim

    mrgrim Junior Member

    Joined:
    Dec 29, 2009
    Messages:
    168
    Likes Received:
    17
    Awesome!

    The numbers are like
    "13.) this comment is the comment of the year, ducks swim Seagull eat baygulls!"
    "14.) {strong} your ass is grass {strong/}.."

    The numbers and tags are what I'm trying to scrub!


    Thanks guys, I'll try Notepad++. I use it for other stuff.
     
  5. mrgrim

    mrgrim Junior Member

    Joined:
    Dec 29, 2009
    Messages:
    168
    Likes Received:
    17
    For HTML, not for number bullets...
     
  6. accelerator_dd

    accelerator_dd Jr. VIP Jr. VIP Premium Member

    Joined:
    May 14, 2010
    Messages:
    2,441
    Likes Received:
    1,005
    Occupation:
    SEO
    Location:
    IM Wonderland
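    To remove the .) as well, you can modify the regex like this (the dot and the closing parenthesis have to be escaped, and + matches numbers with more than one digit):

    ^[0-9]+\.\)

    Or, once the first regex is done, just do a simple find & replace: find ".)" and replace with nothing. Note that this also removes any ".)" that appears inside a comment, not just at the start of a line.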
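
For anyone who would rather script it than use an editor, here is a minimal Python sketch of the same scrubbing. It is only an illustration under some assumptions: one comment per line, number bullets shaped like `13.)`, tags written either as `<...>` or as the `{strong}` style shown earlier in the thread, and surrounding double quotes that should be dropped. The function name `clean_comment` is made up for this example.

```python
import re

# Leading number bullet such as 13.) with an optional quote in front of it.
BULLET = re.compile(r'^\s*"?\s*\d+\.\)\s*')
# HTML-style tags (<b>, </b>) and curly-brace markup ({strong}, {strong/}).
TAG = re.compile(r'<[^<>]*>|\{[^{}]*\}')


def clean_comment(line: str) -> str:
    """Return one comment with its bullet, tags, and outer quotes removed."""
    text = line.strip()
    text = BULLET.sub("", text)   # drop the leading "13.)" style bullet
    text = TAG.sub("", text)      # drop <...> and {...} markup
    text = text.strip('"')        # assumption: outer double quotes are unwanted
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace
```

To run it over a whole file, call `clean_comment` on each line and write out the non-empty results. The tag pattern is deliberately simple, so a comment that legitimately contains `<`, `>`, `{` or `}` as text would need a stricter pattern.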
     
  7. VIC SEO

    VIC SEO Elite Member

    Joined:
    Feb 19, 2010
    Messages:
    2,156
    Likes Received:
    363
    Gender:
    Male
    Occupation:
    SEO Specialist
    Location:
    iSynergyMedia
    Home Page:
    You can use a script in PHP or Notepad++ to get the desired result. I understand what it must look like and hope your problem gets solved using this method.