Need ALL WP URL to txt file

Discussion in 'Blogging' started by freeufcdotinfo, Apr 23, 2011.

  1. freeufcdotinfo

    freeufcdotinfo Power Member

    Joined:
    Jun 12, 2008
    Messages:
    685
    Likes Received:
    155
    Is this possible other than doing it manually? I need every URL from a domain exported to a txt file.
     
  2. PlanetSEO

    PlanetSEO BANNED BANNED

    Joined:
    Dec 20, 2010
    Messages:
    279
    Likes Received:
    403
    I'm not sure it's what you're looking for, but you can try this one:

    Code:
    http://wordpress.org/extend/plugins/wp-web-scrapper/
    Also, just Google it; there are tons of results :)

    I'm not that familiar with Scrapebox (I have it but don't use it much), but I know you can do it with that.
     
  3. freeufcdotinfo

    freeufcdotinfo Power Member

    Thanks, but I don't think this will do what I want.

    I have several WP blogs, each with thousands of posts.

    Each post has a URL such as

    domain.com/blahblah.htm

    I need the URLs exported to a txt file.
     
  4. PlanetSEO

    PlanetSEO BANNED BANNED

    So in this case I think Scrapebox is your best answer; they have a great URL scraper/harvester.

    I can't really tell you how to do that, though, since I'm not using this tool.

    Good luck
     
  5. clau82

    clau82 Junior Member

    Joined:
    Aug 9, 2009
    Messages:
    158
    Likes Received:
    23
    Just add a sitemap plugin.
     
  6. crazyflx

    crazyflx Elite Member

    Joined:
    Nov 9, 2009
    Messages:
    1,674
    Likes Received:
    4,825
    Location:
    http://CRAZYFLX.COM
    Home Page:
    ScrapeBox has a sitemap scraper.

    If your WP blog doesn't already have a sitemap, simply install this plugin (it's free):

    http://wordpress.org/extend/plugins/google-sitemap-generator/

    It takes about two seconds, and will build a sitemap for your site which lists every URL your WP site has.

    Then use the SB "sitemap scraper" addon to scrape the sitemap; it will export every URL it scraped into a .txt file.

    All in all, this will take less than 3 minutes, from start to finish.
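
    For anyone without ScrapeBox, the same sitemap-to-txt idea can be sketched in a few lines of Python run on your own computer. This is only a rough sketch: the function names are made up for illustration, it assumes the sitemap is plain (not gzipped) XML, and it simply collects the text of every <loc> element it finds.

```python
import urllib.request
import xml.etree.ElementTree as ET


def extract_locs(xml_text):
    """Return the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # ElementTree tags look like "{namespace}loc"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


def save_sitemap_urls(sitemap_url, out_path):
    """Download one sitemap and write its URLs to out_path, one per line."""
    with urllib.request.urlopen(sitemap_url) as response:
        xml_text = response.read()
    urls = extract_locs(xml_text)
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")
    return len(urls)


# Example call (hypothetical address and file name, replace with your own):
# save_sitemap_urls("http://www.yourdomain.com/sitemap.xml", "urls.txt")
```

    With several blogs, the example call can be repeated once per sitemap, each with its own output file name.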
     
  7. Dalila

    Dalila Newbie

    Joined:
    Jul 16, 2010
    Messages:
    31
    Likes Received:
    4
    I usually use Scrapebox for this, and if you don't have it... GET IT! It's by far one of the best SEO programs.
     
  8. _Chip_

    _Chip_ Senior Member

    Joined:
    Jun 28, 2009
    Messages:
    847
    Likes Received:
    256
    Occupation:
    Student
    Location:
    Depends on my vpn
    I do the EXACT same thing! :)

     
  9. Autumn

    Autumn Elite Member

    Joined:
    Nov 18, 2010
    Messages:
    2,197
    Likes Received:
    3,041
    Occupation:
    I figure out ways to make money online and then au
    Location:
    Spamville
    Here is a simple PHP script for people who don't have / don't want to use Scrapebox.

    1) Install the xml sitemaps plugin as above ( http://wordpress.org/extend/plugins/google-sitemap-generator/ )
    2) Upload this script to one of your domains.
    3) Load the script in your browser and enter the url to your sitemap. The script will fetch the sitemap and display a plain text list of all the urls.

    PHP:
    <?php

    // Sitemap url extractor
    // Upload this to your hosting to extract plain text urls from your sitemap
    // You must install the xml sitemaps plugin to generate the sitemap for your blog
    // Then enter the sitemap url in the form and press submit button
    // Note: the script fetches whatever url is submitted, so delete it from your server when done

    // pwned by Autumn @ BHW

    // Fetch sitemap and extract urls
    if (isset($_POST['save']) && $_POST['save'] == 'yes') {

        // Fetch the sitemap
        $sitemap_url = $_POST['sitemap_url'];
        $html = file_get_contents($sitemap_url) or die("ERROR: Could not fetch sitemap");
        //echo "$html\n";

        // Strip out urls
        preg_match_all('/<loc>([^<]*)<\/loc>/imsU', $html, $regs);

        // Print urls if found
        // $regs[0] holds the full matches including the <loc> tags,
        // $regs[1] holds just the url captured between them
        if (!empty($regs[1])) {
            foreach ($regs[1] as $url) {
                echo "$url<br>\n";
            }
        } else {
            // Otherwise print error message and original xml
            echo "ERROR: No urls found<br>\n";
            echo "Here is your original xml:<br>\n";

            // Convert angle brackets to entities for display
            $html = preg_replace('/</', '&lt;', $html);
            $html = preg_replace('/>/', '&gt;', $html);

            // Echo original xml
            echo "<pre>\n";
            echo "$html\n";
            echo "</pre>\n";
        }

        exit;
    }

    // Print form to enter the sitemap url
    // (PHP_SELF is escaped so a crafted request path cannot inject markup)
    echo '<html>
    <head>
    <title>Sitemap to url extractor</title>
    </head>
    <body>
    Enter your sitemap url eg. http://www.yourdomain.com/sitemap.xml :<br>
    <form action="' . htmlspecialchars($_SERVER['PHP_SELF']) . '" method="post">
    <input type="hidden" name="save" value="yes">
    <input type="text" name="sitemap_url" size="80">
    <input type="submit" name="submit" value="Extract urls from sitemap">
    </form>
    <br>
    </body>
    </html>';

    ?>
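
    One case worth knowing about: the sitemaps.org protocol lets a sitemap be an index file that only lists other sitemaps. If your plugin writes one of those, pulling out <loc> entries gives you the sub-sitemap addresses instead of post URLs. Below is a rough Python sketch (not part of the script above, and all function names are made up for illustration) that follows an index down to the actual page URLs. It assumes uncompressed XML, and the fetch function is passed in as a parameter so it can be swapped out.

```python
import urllib.request
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def fetch_xml(url):
    """Download a sitemap and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def collect_page_urls(sitemap_url, fetch=fetch_xml, seen=None):
    """Return page URLs from a sitemap, following sitemap index files."""
    seen = set() if seen is None else seen
    if sitemap_url in seen:  # guard against an index that points back at itself
        return []
    seen.add(sitemap_url)

    root = ET.fromstring(fetch(sitemap_url))
    locs = [el.text.strip() for el in root.iter()
            if local_name(el.tag) == "loc" and el.text]

    if local_name(root.tag) == "sitemapindex":
        # Each <loc> here is another sitemap, not a post: recurse into it.
        pages = []
        for child_url in locs:
            pages.extend(collect_page_urls(child_url, fetch, seen))
        return pages
    return locs
```

    For an ordinary sitemap (root element <urlset>) this just returns the <loc> values, so it behaves the same as a plain extraction.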


     
  10. crazyflx

    crazyflx Elite Member

    Nice one Autumn!

    (You following me? ;) )
     
  11. Autumn

    Autumn Elite Member


    :D
     
  12. webnise

    webnise Regular Member

    Joined:
    Dec 21, 2009
    Messages:
    296
    Likes Received:
    121
    Location:
    London
    Rename the attached file to .php and upload it to your wp-content folder so that you can access it at yoursite.com/wp-content/postlinks.php

    It will then give you the page URLs, page titles, and page tags of ALL the pages of your blog.

    You can then save the output as a txt file on your computer and import it into Excel for further filtering if you like.

    I hope this helps.
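
    If you would rather do the filtering step without Excel, here is a rough Python sketch that works on a saved list with one URL per line. It is only an illustration: the function name and the default ".htm" ending are assumptions, so adjust them to your own permalink structure.

```python
from urllib.parse import urlparse


def filter_urls(lines, suffix=".htm"):
    """Keep unique URLs whose path ends with suffix, preserving order."""
    seen = set()
    kept = []
    for line in lines:
        url = line.strip()
        if not url or url in seen:
            continue  # skip blank lines and duplicates
        # Compare the path only, so query strings do not affect the match.
        if urlparse(url).path.endswith(suffix):
            seen.add(url)
            kept.append(url)
    return kept


# Example use (hypothetical file names):
# with open("urls.txt", encoding="utf-8") as source:
#     posts = filter_urls(source)
# with open("posts.txt", "w", encoding="utf-8") as target:
#     target.write("\n".join(posts) + "\n")
```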
     

    Attached Files:

  13. Usman Nasir

    Usman Nasir Registered Member

    Joined:
    Sep 3, 2010
    Messages:
    68
    Likes Received:
    6
    Home Page:

    This helped me as well ;)
     
  14. neroweb

    neroweb Registered Member

    Joined:
    Aug 3, 2010
    Messages:
    65
    Likes Received:
    12
    The script from Autumn above works like a charm. I've been looking for something like this. Thanks, Autumn. :D