title: "search engines - which to choose?"
date: 2021-09-14

hello, i think you know who i am! today we have an interesting subject to deal with: the 'war' to find an ideal search engine for privacy.

the problem of having no index of their own:

today, some of the search engines mentioned here depend on others for their results. by "others" i mean google, bing or yahoo, three massive violators of privacy.
so why do i say this? when you search (we'll use ecosia as an example), the query is first sent to ecosia, which forwards it to bing; bing returns the results to ecosia, which finally passes them on to you.
there are several problems with this configuration:

*  regardless of the engine's own policy, your search terms are stored on the violator's servers and may contain sensitive information.
*  the engine must enter into an agreement with the violator to obtain its results, and the content of that agreement is not disclosed to you and may include data collection.
*  results will still be censored if that is the violator's policy.
*  the engine can be blocked at any time by the violator.
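to make the first point concrete, here is a tiny simulation of that chain. it is only a sketch: `Upstream`, `FrontEnd` and `keeps_logs` are made-up names for illustration, not anything ecosia or bing actually expose, and real logging practices are not visible from the outside.

```python
from dataclasses import dataclass, field


@dataclass
class Upstream:
    """stand-in for the index owner (bing in the ecosia example)."""
    name: str
    query_log: list = field(default_factory=list)

    def search(self, query: str) -> list:
        # the upstream has to see the query to answer it, so it is able to log it
        self.query_log.append(query)
        return [f"{self.name} result for {query!r}"]


@dataclass
class FrontEnd:
    """stand-in for an engine without its own index (ecosia in the example)."""
    name: str
    upstream: Upstream
    keeps_logs: bool = False
    query_log: list = field(default_factory=list)

    def search(self, query: str) -> list:
        if self.keeps_logs:
            self.query_log.append(query)
        # the query is forwarded as-is; the front end only relays the answer
        return self.upstream.search(query)


bing = Upstream("bing")
ecosia = FrontEnd("ecosia", upstream=bing, keeps_logs=False)
results = ecosia.search("embarrassing medical question")

print(results)
print(ecosia.query_log)  # empty: the front end kept nothing
print(bing.query_log)    # the upstream still received the search terms
```

even with the front end logging nothing, the search terms end up in the upstream's hands, which is exactly why the engine's own policy cannot settle the matter.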

this is no different from using invidious, nitter or other alternative clients for their respective services: the violators are still involved. and we keep coming back, hoping to avoid harm.
but as we rely more and more on such applications, it becomes even more difficult to abandon them.


wiby is an interesting new search engine dedicated to lean, personal, old-school websites.
i really recommend keeping an eye on it, and you can literally support the creation of a better and more secure internet by submitting websites to it (if you know a good one, you should really do that, seriously). no big corporations allowed on wiby!
it does not need javascript (actually, there are no scripts on its page at all). it does keep logs for 48 hours, though.


mojeek is the only one using entirely its own crawler, and that is reflected directly in the bad search results.
for example, specific technical or scientific answers often won't be found.
on the other hand, you do avoid google's (and to a lesser extent bing's) deranking of alternative content and the unjustified promotion of mainstream big corpo media.
mojeek's privacy isn't all that great: logs contain the time of visit, page requested, possibly referral data, and browser information, and no retention duration is specified.
at least there's no third party sharing and ip addresses are not stored, though this used to have a caveat:

*  a search query is deemed related to illegal and unethical practices relating to minors, then the full log including visiting ip address will be kept and gladly handed over to any official authorities that ask.

the quote has been recently removed from the privacy policy, so i assume there's no more targeted surveillance.

despite the bad search results, this engine deserves consideration for being the only one with its own index that even cares remotely about your privacy.
another point in its favor is that javascript is not necessary, even for images.
i used to dismiss mojeek because i assumed they would report you for any search query they did not like.
all in all, we finally have an independent search engine that is relatively private and does not censor content.


searx is an open-source proxy for search engines (mostly violators such as google or bing, but also mojeek and others) that anyone can run on their own machine.
some instances include snopyta's (onion), and searx.me (warning: not all of them necessarily support your privacy).
the whole point of searx is to choose which engines you want to use for the results, without sending the requests directly to them.
problem number one: the instances are often blocked by the engines themselves.
number two: the results are often very weak despite choosing big providers.
three: the results are mixed in weird ways, such as a full page of results from a single engine, or the same result repeated a few times.
all these problems seem to depend on the instance used.
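for the third problem, here is a rough idea of how a metasearch front end could mix results more sanely: take one hit from each engine in turn and skip urls that were already shown. this is my own sketch, not searx's actual merging code; `merge_results` and the example engines are hypothetical.

```python
from itertools import zip_longest


def merge_results(per_engine: dict) -> list:
    """interleave result urls from several engines and drop repeats."""
    merged = []
    seen = set()
    # take the 1st hit of every engine, then the 2nd of every engine, and so on
    for rank_slice in zip_longest(*per_engine.values()):
        for url in rank_slice:
            if url is None:  # this engine ran out of results
                continue
            key = url.rstrip("/")  # crude normalisation, only handles a trailing slash
            if key in seen:
                continue
            seen.add(key)
            merged.append(url)
    return merged


example = {
    "engine_a": ["https://a.example/1", "https://shared.example/", "https://a.example/2"],
    "engine_b": ["https://shared.example", "https://b.example/1"],
}
print(merge_results(example))
```

the round-robin order keeps a single engine from filling a whole page, and the `seen` set removes the repeated entries.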
searx has an extremely annoying bug where the results don't go beyond the first page. for a while it seemed to be gone, but it still happens (again, it depends on instance, settings, time of usage... some instances don't seem to have that problem).
here's what the privacy policy of snopyta's instance says:

*  what data do you collect? - we collect as few as possible. for example, we don't log ip addresses or search queries you make with searx.

now, these two instances are not very good in terms of usability (they suffer from search engines going down often, as well as the dreaded "no results beyond first page" issue).
check the instance list; perhaps you'll find a better one.

the available search categories are general, files, images, it, map, music, news, science, social media and videos, and you can choose which search providers (over 70 available) are used to display the results for each of them.
there are many other options like enforcing https, removing trackers from returned urls, changing the theme, content filtering, etc.
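to give an idea of what "removing trackers from returned urls" means in practice, here is a minimal sketch using only python's standard library. the parameter list (`utm_*`, `fbclid`, `gclid`) is my own assumption of commonly seen tracking parameters, not the list searx uses, and `strip_trackers` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# assumed examples of tracking parameters; a real list would be longer
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_trackers(url: str) -> str:
    """return the url with known tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    # rebuild the url with only the parameters we decided to keep
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_trackers("https://example.org/post?id=7&utm_source=feed&fbclid=abc"))
```

the page still loads the same content, but the link no longer tells the destination where you came from.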
searx is also integrated with the wayback machine; this means you can check out sites without connecting directly to them, even if they are no longer available.
the service does not require javascript for its functionality at all; you can even save settings through a bookmarked url without either javascript or cookies enabled.
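how can settings survive without cookies or javascript? one way is to pack them into the url itself, so the bookmark carries them. the sketch below shows that idea; the `preferences` parameter name, the compress-then-base64 encoding and the `searx.example` host are assumptions for illustration, so don't take this as the exact format searx uses.

```python
import base64
import zlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def settings_to_url(base_url: str, settings: dict) -> str:
    """pack settings into a single query parameter of a bookmarkable url."""
    packed = zlib.compress(urlencode(settings).encode("utf-8"))
    token = base64.urlsafe_b64encode(packed).decode("ascii")
    return f"{base_url}?{urlencode({'preferences': token})}"


def settings_from_url(url: str) -> dict:
    """reverse of settings_to_url: read the settings back from the url."""
    query = dict(parse_qsl(urlsplit(url).query))
    packed = base64.urlsafe_b64decode(query["preferences"])
    return dict(parse_qsl(zlib.decompress(packed).decode("utf-8")))


bookmark = settings_to_url("https://searx.example/", {"theme": "simple", "safesearch": "0"})
print(bookmark)
print(settings_from_url(bookmark))
```

since everything lives in the url, the server has nothing to store about you and the browser needs neither cookies nor scripts.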