
Can one of you help duckduckgo build a better spider/engine either through distributed com

Discussion in 'Black Hat SEO' started by Bostoncab, Jan 3, 2013.

  1. Bostoncab

    Bostoncab Elite Member

    Joined:
    Dec 31, 2009
    Messages:
    2,255
    Likes Received:
    514
    Occupation:
    pain in the ass cabbie
    Location:
    Boston,Ma.
    Home Page:
    Can one of you help duckduckgo build a better spider/engine either through distributed computing or through a new "cpanel" server software

    As the owner of a small business who relies on organic and paid Google traffic to make a living, I have grown to hate Google: their "search bubble", their "algorithm changes" to produce "better results", and their constant insistence that I properly manage "free services" like Google "Places" or "Maps" or whatever the shit they are calling it this week. All of it forces me to spend more and more on paid listings, while my competitors rack up fake listings in Places that never get taken down no matter how many times I report them.


    I went looking for a better search engine to promote and came across duckduckgo.com. This site gives awesome, relevant results, does not track you, and has the great feature that there is no "search bubble", meaning the results you see when you search are exactly what I see when I search.


    The only problem with this search engine (aside from their name, which although cute is impossible for people to remember, thereby limiting its popularity) is that they lack the resources to crawl the web on their own and rely on harvested results from other search sources.


    In communicating with the duckduckteam I learned that crawling is expensive for companies because it is resource-intensive, requiring endless server racks and high-speed connections that are not cheap to rent and way too expensive for small companies to purchase.


    So someone needs to come up with a way for the duckduckteam to crawl the entire web cheaply. I have some ideas in this regard that I have no ability to initiate on my own (I lack the know-how to cross-post to other subreddits, never mind code my ideas).


    Idea 1
    Use a distributed computing service, much like SETI@home, that crawls the web using the computing power of a worldwide net of volunteer machines, to give the duckduckteam the power they need to be a serious competitor to Google.
    Idea 1A
    Do the same as above and complement it with a mobile app that harnesses all the tablets and phones in the world. If this app became widespread, the entire web could be crawled in no time at all.
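
One way the volunteer model in Idea 1 could hand out work is sketched below. It is only an illustration under stated assumptions: the function names `assign_volunteer` and `partition` are hypothetical, and the choice to hash the hostname (so that one volunteer handles all pages of a given site) is a design guess, not anything DuckDuckGo has described.

```python
import hashlib
from urllib.parse import urlsplit


def assign_volunteer(url: str, num_volunteers: int) -> int:
    """Map a URL to a volunteer index using a stable hash of its hostname.

    Hashing the hostname (not the full URL) keeps every page of one site
    with the same volunteer, which makes per-site rate limiting simpler.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_volunteers


def partition(urls, num_volunteers):
    """Group URLs into one work list per volunteer."""
    buckets = {i: [] for i in range(num_volunteers)}
    for url in urls:
        buckets[assign_volunteer(url, num_volunteers)].append(url)
    return buckets


if __name__ == "__main__":
    work = partition(
        [
            "http://example.com/",
            "http://example.com/about",
            "http://example.org/",
        ],
        num_volunteers=4,
    )
    for volunteer, todo in work.items():
        print(volunteer, todo)
```

Both `example.com` pages land in the same bucket because only the hostname is hashed. A real system would still need a coordinator to track which volunteers are online and to reassign abandoned work.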


    Idea 2
    The way I understand it, most search spiders work by visiting a link on a web page, then visiting all the outbound links on that page, and so on. This has an obvious shortcoming in that there is content out there that is not linked to and does not link to anything else.
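
The shortcoming described here can be shown with a toy link-following crawl. This is a minimal sketch over a made-up, in-memory link graph (the `LINKS` dictionary and its page names are invented for illustration); a real spider would fetch pages over HTTP and parse the links out of the HTML.

```python
from collections import deque


def crawl(seed, links):
    """Breadth-first crawl: start at `seed` and follow outbound links.

    `links` maps each page to the list of pages it links to.
    Returns the set of pages the crawler managed to reach.
    """
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical mini-web: nothing links to orphan.example/.
LINKS = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": [],
    "orphan.example/": [],
}

reached = crawl("a.example/", LINKS)
unreached = set(LINKS) - reached

if __name__ == "__main__":
    print("reached:", sorted(reached))
    print("never found:", sorted(unreached))
```

The page that nothing links to never enters the queue, so a crawler that only follows links cannot discover it no matter how long it runs.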


    The argument could be made that if it does not link to anything and nothing links to it, then the content is probably worthless, but why take that chance? If our endeavor is to remake search into the best search ever, let's do the job.


    There is software called cPanel which is used to manage a rather large chunk of the Linux-based servers in the world. An add-on for cPanel could be coded to crawl all of the content on these servers. If cPanel won't play along, then a cPanel substitute could be coded.


    Idea 1B
    Using the distributed model discussed earlier, it would be easier to crawl every possible IP address in the world for all content on that IP. Every website and device has an IP, right? I know IPv6 is coming and this will make the job harder, but the Cubs will also eventually win the World Series, and if we wait long enough Jesus may make a comeback also... meaning IPv6 is a great idea, I just don't see it replacing IPv4 anytime soon.
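
For a rough sense of scale, the IPv4 address space holds 2^32 (about 4.3 billion) addresses. The sketch below splits it into /16 blocks that could be handed out as work units. The function names are hypothetical, and it deliberately ignores two complications: reserved and private ranges, and the fact that many websites share a single IP through name-based virtual hosting, so scanning an IP does not by itself reveal every site hosted on it.

```python
import ipaddress
from itertools import islice

TOTAL_IPV4 = 2 ** 32  # every possible IPv4 address, reserved ranges included


def work_units(prefix_len=16):
    """Yield the IPv4 space as equal-sized blocks (65,536 blocks for /16)."""
    return ipaddress.ip_network("0.0.0.0/0").subnets(new_prefix=prefix_len)


def addresses_per_volunteer(num_volunteers):
    """Addresses each volunteer must cover if the space is split evenly."""
    return -(-TOTAL_IPV4 // num_volunteers)  # ceiling division


first_three = [str(block) for block in islice(work_units(), 3)]

if __name__ == "__main__":
    print("total addresses:", TOTAL_IPV4)
    print("first blocks:", first_three)
    print("per volunteer (1M volunteers):", addresses_per_volunteer(1_000_000))
```

With a million volunteers, each one would have only a few thousand addresses to probe, which is why the idea looks cheap on paper. The same split does not carry over to IPv6, whose 2^128 addresses are far too many to enumerate.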


    I am just some taxi driver who sees that search could be made tons better if only someone with the ability to make it better would do it. Duckduckgo is on the right track as far as privacy and all that; they just lack the resources to do the job right. Why doesn't one of you help them out?
     
  2. Zapdos

    Zapdos Power Member

    Joined:
    Oct 22, 2011
    Messages:
    597
    Likes Received:
    708
    Location:
    Eastern North Carolina
    Distributed models have security flaws unless they aggregate results from several machines.

    If it's only 1 computer doing the "crunching" of a website, then the results can be manipulated. If it's multiple, then you need a coordination server to tell multiple machines to visit at the same time, compare their results, and take the normalized result as what to submit.
    Then you have the data problem. A search engine relies on its data, and a single scraped+parsed page may end up being 10 megs in size, which the company still has to download from the volunteer. Nothing is saved.
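
The "compare results" step could look something like the sketch below. It assumes the coordination server receives the raw page bytes from several volunteers and accepts a copy only when a strict majority returned identical content; `accept_result` and `fingerprint` are hypothetical names, and exact-hash comparison stands in for whatever normalization a real system would need (pages often differ slightly between visits).

```python
import hashlib
from collections import Counter


def fingerprint(content: bytes) -> str:
    """Short, comparable identifier for one volunteer's copy of a page."""
    return hashlib.sha256(content).hexdigest()


def accept_result(submissions):
    """Return the page content that a strict majority of volunteers agree on.

    `submissions` is a list of raw page bytes, one per volunteer.
    Returns None when no version has more than half of the votes.
    """
    if not submissions:
        return None
    counts = Counter(fingerprint(content) for content in submissions)
    digest, votes = counts.most_common(1)[0]
    if votes * 2 <= len(submissions):
        return None
    for content in submissions:
        if fingerprint(content) == digest:
            return content
    return None


if __name__ == "__main__":
    honest = b"<html>real page</html>"
    tampered = b"<html>buy my pills</html>"
    print(accept_result([honest, honest, tampered]))
    print(accept_result([honest, tampered]))
```

Note that this illustrates the cost being pointed out: every page is fetched several times over, and the server still has to receive the content, so redundancy buys trust at the price of bandwidth.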

    Distributed computing for a search engine is useless and a waste of resources for both the company and the users.


    The other problem is, almost no one uses duckduckgo. You hate Google? Great. The people who don't make websites love it and don't care about what you think.
     
  3. Bostoncab

    Bostoncab Elite Member

    Joined:
    Dec 31, 2009
    Messages:
    2,255
    Likes Received:
    514
    Occupation:
    pain in the ass cabbie
    Location:
    Boston,Ma.
    Home Page:
    There has to be a better solution than letting Google or another giant have control over.... I won't go into a rant. I am more or less sure that most BHW users would agree with my viewpoint on Google...

    There has to be a solution.


     
  4. Sam Wylde

    Sam Wylde Regular Member

    Joined:
    Jul 1, 2011
    Messages:
    475
    Likes Received:
    76
    Occupation:
    Boss
    Location:
    PA
    I completely agree that there needs to be a good alternative to Google. I personally use Duck Duck Go and think it's a great search engine.

    Instead of people complaining about Google and acting like they are King, why don't we Black Hatters throw our support behind a different engine like DuckDuckGo?
     
  5. Zapdos

    Zapdos Power Member

    Joined:
    Oct 22, 2011
    Messages:
    597
    Likes Received:
    708
    Location:
    Eastern North Carolina
    Because they account for maybe 0.01% of a site's visitors, if that?

    Nearly all your viewers don't give a rat's ass about Google and their search algorithm. The only way to "switch" search engines is to make all visitors referred from Google go to a landing page telling them to use something else in the future.
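
Detecting "visitors referred from Google" usually comes down to inspecting the HTTP `Referer` header. A minimal sketch follows, with the caveats that `came_from_google` is a hypothetical helper, that browsers may omit or trim the header, and that the hostname check is deliberately crude (it treats any hostname containing a `google` label as Google).

```python
from urllib.parse import urlsplit


def came_from_google(referer):
    """Guess whether a request was referred by a Google search page.

    `referer` is the raw value of the HTTP Referer header, or None/empty
    when the browser did not send one.
    """
    if not referer:
        return False
    host = (urlsplit(referer).hostname or "").lower()
    # Crude check: matches www.google.com, google.co.uk, and so on.
    return "google" in host.split(".")


if __name__ == "__main__":
    print(came_from_google("https://www.google.com/search?q=boston+taxi"))
    print(came_from_google("https://duckduckgo.com/?q=boston+taxi"))
    print(came_from_google(None))
```

A site would call something like this on each incoming request and, when it returns true, serve the landing page instead of the normal one.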
     
  6. Sam Wylde

    Sam Wylde Regular Member

    Joined:
    Jul 1, 2011
    Messages:
    475
    Likes Received:
    76
    Occupation:
    Boss
    Location:
    PA
    I make over half my income from Yahoo and Bing alone. Google is in no way the only way to make money. I'm sure if enough Blackhatworld members with enough pull got behind any engine, it could be quite successful and profitable. As far as I know, this hasn't been done before.
     
  7. Bostoncab

    Bostoncab Elite Member

    Joined:
    Dec 31, 2009
    Messages:
    2,255
    Likes Received:
    514
    Occupation:
    pain in the ass cabbie
    Location:
    Boston,Ma.
    Home Page:
    Not a bad idea. Code that into a configurable WP plugin and I'll install it. Here, let me help you with the name: FITBAK 1.0. LMK when you have it ready, please.


     
  8. Zapdos

    Zapdos Power Member

    Joined:
    Oct 22, 2011
    Messages:
    597
    Likes Received:
    708
    Location:
    Eastern North Carolina
    Mind giving the figures on DuckDuckGo income or any of the ones that are not from a major company? Google, Bing and Yahoo all have billions of dollars behind them.
     
    Last edited: Jan 4, 2013
  9. thejake

    thejake Jr. VIP Premium Member

    Joined:
    Nov 13, 2009
    Messages:
    685
    Likes Received:
    828
    lel ok :D
     
  10. Bostoncab

    Bostoncab Elite Member

    Joined:
    Dec 31, 2009
    Messages:
    2,255
    Likes Received:
    514
    Occupation:
    pain in the ass cabbie
    Location:
    Boston,Ma.
    Home Page:
    Sorry, it's not a very popular search engine, even though it is more or less the best one. This is my opinion, but if you use it for a bit you will see that it gives quality results, only displays one ad block (that you can actually opt out of), and has no bubble. If you want to test it, let me know and I will search for whatever term you desire and we will compare screenshots. I am not saying we should pimp duckduckgo because it has more users than Bing, Google or whoever; I am saying we should pimp it because it is better and they actually do no evil, unlike the evil pieces of sh... Google.

    At least I know you read it. I posted on r/stealmyidea first before realizing BHW probably has someone with the skill to accomplish something.
     
  11. Bostoncab

    Bostoncab Elite Member

    Joined:
    Dec 31, 2009
    Messages:
    2,255
    Likes Received:
    514
    Occupation:
    pain in the ass cabbie
    Location:
    Boston,Ma.
    Home Page:
    I know I am making money from Bing and Yahoo results. There are keyword searches on Google where I am not in the top 10 and am number one on Bing and Yahoo. When I am not paying Google I still get business based on those searches, albeit not much business. It must be coming from Yahoo and Bing users.

    I actually just discovered Siri likes me also, as she is often giving out my phone number.
     
  12. Zapdos

    Zapdos Power Member

    Joined:
    Oct 22, 2011
    Messages:
    597
    Likes Received:
    708
    Location:
    Eastern North Carolina

    I'm not saying it shouldn't be promoted because of lack of users, far from it. Usually anything that has the majority or a large slice of the market is the worst. I'm just saying it's impractical to use distributed computing to help them, as it would create more problems than it solves. For them to compete they need to create more efficient crawlers and storage systems, and get more servers and more bandwidth. For them to become more widely used they need better advertising/PR.

    Also, I did test DuckDuckGo after you posted. I like the concept of their DuckDuckHack program. Some developers could do some very interesting things with it to make the search experience easier. My problem with DDG, though, is the design. I've used Google for a long time and DDG's results just look too odd. I may end up using it for the hacks already implemented, but meh. The only way I'll start optimizing for anything other than Google is if DDG can get more than a 1% or 2% share of my possible viewers.
     