
Yet Another Simple Question I'll give you REP :)

Discussion in 'BlackHat Lounge' started by Erik Creed, Oct 30, 2010.

  1. Erik Creed

    Erik Creed BANNED BANNED Premium Member

    Joined:
    Jun 19, 2008
    Messages:
    493
    Likes Received:
    438
    Hey all,

    So here's the question:

    I got urls such as this:
    1 http://www.website.com
    2 http://www.website.com
    3 http://www.website.com
    4 http://www.website.com

    Now how do I remove everything before the http:?

    This is a list I got after buying an xrumer service on BHW. It has 15,000+ lines, and I need to remove everything before the http so that I can PING the urls.

    Also any advice on how to ping 15,000 urls in the most effective way would be awesome.
     
  2. BiggDipper

    BiggDipper Registered Member

    Joined:
    Mar 5, 2009
    Messages:
    76
    Likes Received:
    182
    Occupation:
    bh Marketer
    Use excel,
    1. Paste all urls into excel.
    2. Select the data you want to manipulate
    3. Go to data tab and select "text to columns"
    4. Choose Delimited
    5. Select only the checkbox that is space
    6. Click Next
    7. Click Finish

    Now your data will be separated into columns, and you can delete or keep only the columns that you are after.
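
For anyone who would rather script this than click through Excel, here is a rough Python sketch of the same "split on space, keep the URL column" idea. The function name `keep_url_column` and the sample list are made up for illustration, and it assumes each line holds at most one URL starting with http.

```python
def keep_url_column(lines):
    """Split each line on whitespace and keep the first field that starts with http."""
    urls = []
    for line in lines:
        # str.split() with no argument splits on any run of spaces or tabs,
        # which plays the role of the "space" delimiter in Text to Columns.
        for field in line.split():
            if field.lower().startswith("http"):
                urls.append(field)
                break  # assumption: only one URL per line
    return urls


# Hypothetical sample in the same shape as the list in the first post
sample = ["1 http://www.website.com", "2 http://www.website.com"]
print(keep_url_column(sample))
```

Lines with no URL in them (blank lines, for example) are simply skipped.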

    Good Luck
     
    • Thanks Thanks x 1
  3. Ramsweb

    Ramsweb Senior Member

    Joined:
    Mar 31, 2010
    Messages:
    1,121
    Likes Received:
    658
    Occupation:
    Internet Marketer - Self Employed
    Location:
    In front of my PC
    use the text to column function in excel. Google it. It is pretty easy.
     
    • Thanks Thanks x 2
    Last edited: Oct 30, 2010
  4. Fwiffo

    Fwiffo Power Member

    Joined:
    Apr 7, 2010
    Messages:
    562
    Likes Received:
    325
    Occupation:
    Starship Captain
    Location:
    Pluto / Spathiwa
    I'm not sure exactly what you're asking, but I'll try

    you can usually use text functions in excel

    try the "substitute" function

    put your domain in one cell, then
    Code:
    =substitute(A1, "http://", "")
    The "" is an empty string, so the matched text is replaced with nothing

    that should remove the "http://"

    then pull the cell down for the whole list
     
    • Thanks Thanks x 1
  5. Condorr

    Condorr Newbie

    Joined:
    Sep 25, 2010
    Messages:
    33
    Likes Received:
    7
    What Fwiffo said should work, but an even easier way is to use the Find/Replace option in a text editor like Notepad++. Just search for "http:" and replace it with nothing; that will delete all instances of "http:".

    I don't think the built-in Notepad on Microsoft Windows supports this, so you may need to download a more advanced editor like Notepad++.

    Edit:
    I see you're asking to remove the numbers 1, 2, 3 etc. In that case, use Excel's text to column option as BiggDipper said.
     
    • Thanks Thanks x 2
    Last edited: Oct 30, 2010
  6. jstover77

    jstover77 Executive VIP Jr. VIP Premium Member

    Joined:
    Feb 27, 2009
    Messages:
    2,313
    Likes Received:
    3,530
    Gender:
    Male
    Occupation:
    Entrepreneur
    Location:
    Doylestown PA
    Home Page:

    Ya this is the really easy way to do it. Push Ctrl+H to pull it up in Notepad. Put "http" in the Find what field, and nothing in the Replace with field. Push Replace All and voila.
     
    • Thanks Thanks x 2
  7. Fwiffo

    Fwiffo Power Member

    Joined:
    Apr 7, 2010
    Messages:
    562
    Likes Received:
    325
    Occupation:
    Starship Captain
    Location:
    Pluto / Spathiwa
    that way works better!

    didn't notice the 1, 2, 3 etc. however - in excel I prefer the
    Code:
    =replace()
    function - just dump in:
    - old text
    - start number (1, as Excel counts characters from 1)
    - number of characters that you want to remove
    - then "" to replace the text with nothing
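
A rough Python sketch of what those four arguments do, in case it helps. `excel_replace` and `strip_prefix_before_http` are made-up helper names, not anything built in. The second helper works out the character count from where "http" starts, since a fixed count breaks once the numbers in front grow from one digit to five. In a sheet, something like =REPLACE(A1,1,FIND("http",A1)-1,"") should do the same job, assuming every row contains "http".

```python
def excel_replace(old_text, start_num, num_chars, new_text):
    """Mimic Excel's REPLACE(old_text, start_num, num_chars, new_text).

    start_num is 1-based, as in Excel.
    """
    if start_num < 1 or num_chars < 0:
        raise ValueError("start_num must be >= 1 and num_chars >= 0")
    head = old_text[: start_num - 1]
    tail = old_text[start_num - 1 + num_chars:]
    return head + new_text + tail


def strip_prefix_before_http(text):
    """Remove everything before the first 'http', however long the prefix is."""
    pos = text.find("http")  # 0-based index == number of characters before it
    if pos == -1:
        return text  # no URL on this line, leave it alone
    return excel_replace(text, 1, pos, "")


print(excel_replace("1 http://www.website.com", 1, 2, ""))
print(strip_prefix_before_http("15000 http://www.website.com"))
```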

    i agree that notepad ++ is quicker for most tasks, I'm just a bit of an excel geek and prefer to keep the sheets for when I need to do quick re-dos :)
     
    • Thanks Thanks x 1
  8. voidale

    voidale Jr. VIP Jr. VIP Premium Member

    Joined:
    Sep 29, 2008
    Messages:
    583
    Likes Received:
    176
    The easiest way is to open the list in a text editor and use Ctrl + H (Replace) to replace http:// with a blank. That's it ;)

    Sorry, I thought you wanted to remove the http://. Replace can still work on the part in front of the website, but if all 15k prefixes are unique that will take a huge amount of time. I don't see why you'd need a file with 15k http: only, though lol
     
    • Thanks Thanks x 1
    Last edited: Oct 30, 2010
  9. bezopravin

    bezopravin BANNED BANNED

    Joined:
    May 11, 2010
    Messages:
    461
    Likes Received:
    3,471
    =======================================================================

    4 Simple Steps to Remove Numbers In front of Texts

    1. Open your list in Notepad++
    2. Press Ctrl + A (To Select All texts)
    3. Navigate to TextFX Menu --> TextFX Tools and Select Delete Line Numbers or First Word
    4. Voila! Your List is Clean!!! :)
    Hope you enjoyed this Tip...

    =======================================================================

    Download Link for Notepad++:

    Code:
    http://sourceforge.net/projects/notepad-plus/files/latest
    Guide Shot :

    [IMG]
     
    • Thanks Thanks x 7
  10. bhprince

    bhprince Registered Member

    Joined:
    Oct 19, 2010
    Messages:
    57
    Likes Received:
    8
    I would use vim and run :%s#stringToDelete## in command-line mode (stringToDelete can be a regular expression)
     
    • Thanks Thanks x 1
  11. THUNDERELVI

    THUNDERELVI Elite Member

    Joined:
    Sep 12, 2009
    Messages:
    2,202
    Likes Received:
    1,729
    Gender:
    Male
    Location:
    W3
    1. Open Notepad, no need for Notepad++
    2. Press Ctrl+A (This will select all text)
    3. Press Ctrl+H (Will open replace window)
    4. Find what: enter http:
    5. Replace with: blank(enter nothing, leave it blank)
    6. Click Replace All
    7. Done...

    And no need to give me another rep:D LOL
     
    • Thanks Thanks x 1
  12. Nuz25

    Nuz25 Junior Member

    Joined:
    Aug 20, 2010
    Messages:
    129
    Likes Received:
    100
    Occupation:
    I'm a student

    I can make you a simple C++ program that will take this list in a file and create another file displaying the initial list in format you need (In this case without numbers in front of the URLs). You won't have to do anything... just run the executable file and get a nice txt file at the end so you can just put it into scrapebox in a matter of seconds. (I think it's the best solution :D)

    Just PM me.
     
    • Thanks Thanks x 1
    Last edited: Oct 30, 2010
  13. Scripteen

    Scripteen Elite Member

    Joined:
    Sep 19, 2009
    Messages:
    1,811
    Likes Received:
    1,918
    Home Page:
    Actually if he wants to use it in scrapebox then he can just import from clipboard and scrapebox will only import urls ;)
     
    • Thanks Thanks x 1
  14. txholdem

    txholdem Elite Member

    Joined:
    Feb 23, 2009
    Messages:
    1,620
    Likes Received:
    193
    Home Page:
    Learn regular expressions and you can add them back too!
     
  15. khirad

    khirad BANNED BANNED

    Joined:
    Nov 10, 2009
    Messages:
    643
    Likes Received:
    181
    http://www.blackhatworld.com/blackhat-seo/members/112987-bezopravin.html


    Wow u deserve thanks
     
  16. wickedguy

    wickedguy Supreme Member

    Joined:
    Jul 22, 2009
    Messages:
    1,402
    Likes Received:
    1,379
    Location:
    BHW--> South Africa
    Home Page:
    PHP:
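    <?php
    // Read the whole list into an array, one element per line
    $data = file("urls.txt");
    // Reopen the same file with "w" to truncate it before rewriting
    $f = fopen("urls.txt", "w");
    foreach ($data as $line) {
        // Drop everything before the first "http" on the line
        $line = preg_replace('/^.*?(?=http)/i', '', $line);
        fputs($f, $line); // write the cleaned line back to the file
    }
    fclose($f);
    ?>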
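
For those without PHP handy, the same idea could be sketched in Python roughly like this. `clean_line` and `clean_file` are names made up for this example, and the file name urls.txt is taken from the PHP above. Like the PHP version it overwrites the file in place, so keep a backup copy first.

```python
import re

# Everything from the start of the line up to (not including) the first "http"
PREFIX = re.compile(r"^.*?(?=http)", re.IGNORECASE)


def clean_line(line):
    """Return the line with anything before the first 'http' removed."""
    return PREFIX.sub("", line, count=1)


def clean_file(path):
    """Rewrite the file at `path` in place, cleaning every line."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(clean_line(line) for line in lines)


print(clean_line("3 http://www.website.com"))
# To process a real list: clean_file("urls.txt")
```

Lines that contain no "http" at all are written back unchanged.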
     
    • Thanks Thanks x 1
  17. edgematch

    edgematch Elite Member

    Joined:
    May 24, 2010
    Messages:
    2,539
    Likes Received:
    1,949
    Occupation:
    You can never guess!
    Location:
    :noitacoL
    Totally shocked to see that nearly half of the people are thinking about replacing "http://" with blank "". Thanks bezopravin. Yours is clearly the best.

    Now I see why IM will never ever die.
     
    • Thanks Thanks x 2
    Last edited: Oct 31, 2010
  18. IamNRE

    IamNRE Jr. VIP Jr. VIP Premium Member

    Joined:
    Aug 18, 2010
    Messages:
    4,663
    Likes Received:
    7,108
    Occupation:
    Generate Leads With FB Ads For Just $1
    Home Page:
    Open word...

    Copy paste your list.

    In Word 2007, the top right corner has Find, Replace, etc.

    Press Replace.

    Then enter "^# " in the Find what box (without the quotes, but with the trailing space) and leave Replace with empty.

    Or

    Go "replace"

    press "more"

    press "special"

    choose "any digit"

    simples!
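
One thing to watch with this approach: as far as I know, Word's ^# stands for exactly one digit, so on a 15,000-line list only the last digit of each number (plus the space) gets removed. The Python sketch below imitates that behaviour with a regular expression and shows a variant that takes the whole leading number. Both function names are made up for the example, and the regex is only a stand-in for Word's wildcard, not Word itself.

```python
import re


def word_style_single_digit(line):
    """Rough stand-in for Replace All with '^# ': one digit, then one space."""
    return re.sub(r"[0-9] ", "", line)


def strip_leading_number(line):
    """Remove the whole leading number and the whitespace after it."""
    return re.sub(r"^\s*[0-9]+\s+", "", line)


print(word_style_single_digit("12 http://www.website.com"))
print(strip_leading_number("12 http://www.website.com"))
```

With a two-digit line number the first call leaves a stray "1" glued to the URL, while the second returns the bare URL.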

    Rep me :)
     
    • Thanks Thanks x 1
  19. srb888

    srb888 Elite Member

    Joined:
    Jul 30, 2008
    Messages:
    3,260
    Likes Received:
    5,067
    Gender:
    Male
    Occupation:
    WebzSurfer
    Location:
    Sun, Mon, Tue, WTF, Sat!!! :)
    This is quite simple ;). Use Textpad and then do a Search & Replace (Ctrl+H):

    Put this string in the search box (of the Replace window)>>>
    ^.*\?
    and of course, keep the replace box blank

    then hit the replace all button.
    It will delete what you don't want in this instance:

    For example you have plenty of lines of the following text >>>
    http://anonym.to/?http://www.website.com/
    (whatever the "website" url is)
    and you want to remove the [http://anonym.to/?] part, then the above Search & Replace will give you this result>>>
    http://www.website.com/

    Just remember to keep the Regular Expression option ticked.
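
The same search pattern can be tried out in Python, which is a quick way to check a regex before running it over a big list. This is only a sketch; `strip_redirect_prefix` is a made-up name. Note that `.*` is greedy, so if the target URL has its own query string the pattern cuts up to the last "?" on the line, not the first.

```python
import re

# Same pattern as in the Replace window: from line start through a "?"
REDIRECT_PREFIX = re.compile(r"^.*\?")


def strip_redirect_prefix(line):
    """Delete everything up to and including the last '?' on the line."""
    return REDIRECT_PREFIX.sub("", line, count=1)


print(strip_redirect_prefix("http://anonym.to/?http://www.website.com/"))
```

If the URLs can contain a "?" of their own, a non-greedy pattern such as `^.*?\?` stops at the first one instead.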

    HTH :)


    II. To answer the 2nd part of your question:

    I am a humble pinger! lol That means I ping about 25-30 urls at a time and pinging 15k urls is almost unimaginable to me.:D But I use BlogPinger to ping those 25+ urls and it does the job of pinging the list automatically and at set times, and throughout the day if I don't stop it. :) ... But you can always push the program and see. Pushing is always a better exercise, you see! ;)


    The 1st part has already been solved for you in this post; you can find Textpad very easily (Google it), or any other software that supports Regular Expressions on a text file (your list) will also do the job.

    And thanks for giving this great opportunity to help someone! lol I feel quite refreshed! lol

    :)
     
    • Thanks Thanks x 1
    Last edited: Oct 31, 2010
  20. Erik Creed

    Erik Creed BANNED BANNED Premium Member

    Joined:
    Jun 19, 2008
    Messages:
    493
    Likes Received:
    438
    Thanks it worked!

    I downloaded from:
    Code:
    http://download.cnet.com/Notepad/3000-2352_4-10327521.html?tag=mncol;1
    I also had an empty line below and above each url, so I went to
    TextFX > Text Edit > Delete Blank Lines
    And thus got rid of those...

    Then I had a third problem: empty space between the numbers and http. So I went to Search > Replace,
    pasted in the empty space I had copied earlier, entered nothing in the Replace with field, and clicked Replace All.
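
Those two clean-up passes (delete blank lines, then remove the leftover space in front of each URL) could also be done in one go with a few lines of Python. A minimal sketch, with `tidy` as a made-up function name and a hypothetical sample list:

```python
def tidy(lines):
    """Drop blank lines and strip leading/trailing whitespace from the rest."""
    cleaned = []
    for line in lines:
        stripped = line.strip()  # removes spaces, tabs and the newline
        if stripped:             # skip lines that were empty or only whitespace
            cleaned.append(stripped)
    return cleaned


# Hypothetical sample: blank lines around each URL, stray space in front
sample = ["", "   http://www.website.com", "", "\thttp://www.website.com  ", ""]
print(tidy(sample))
```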

    Great stuff it works....

    Now how Do I ping.. hmmmmmmmmmmmmmmmmmmm

    By the way, the dude with the images about notepad++ got a rep, but I'd like to thank all the others; you've shown some great ambition :)
     
    Last edited: Oct 31, 2010