Yioop software provides many of the same features as larger search portals:
Search Results. Yioop comes with a crawler that can be used to crawl the open web or a selection of URLs of your choice. It can index popular archive formats like Wikipedia XML dumps, arc, warc, Open Directory Project RDF (now Curlie.org), as well as dumps of emails or databases. Once you have created Yioop indexes of your desired data sources, Yioop can serve as a search engine for your data. It supports "crawl mixes" of different data sources. It also supports knowledge wiki callouts related to search results, which can be based on Wikipedia or other sources. Yioop also provides tools to classify and sculpt your data before it is used in search results.
Media Services. News is best when it is still fresh. Yioop has a media updater process that can be used to re-index RSS and Atom feeds on an hourly basis. This more timely information can then be incorporated into Yioop search results. Yioop's media updater can also be used to recode uploaded videos to mp4, handle bulk email, and calculate view statistics. Yioop's trending tools can track how words and values on news and other pages change over time.
Social Groups, Blogs, and Wikis. Yioop can be configured to allow users to create discussion groups, blogs, and wikis. Blogs and discussion groups can be made public or private, and posts can be made to expire if desired. Yioop supports an easy-to-learn ChatBot API for making group feed chat bots.
7.2.1 24 Aug 2020 18:21
Crawler and Search Engine
* Search results can be presented as continuous scroll or as paged results.
* For privacy, tells the browser not to send the search query in the referrer when a user clicks on a search result.
* Hamburger menu now adds quick access to narrow search by time, language, video duration, etc.
* Landing page can be configured to show trending and news highlights from admin specified subsearches.
* Knowledge Wiki callouts on search results, with one tool to make callouts manually and another to generate them from Wikipedia dumps.
* Improved crawl result editor that makes it easier to filter results from searches, edit search snippets, and pin urls at top of search results.
* Better display of trending news items, query and group statistics. Better charting of Trending News Items.
* Speed up image loading in news item search results.
* Improved link farm detection.
* Which media updater jobs are running can now be controlled from the UI.
* Adds Tesseract support for OCRing images in PDFs.
* Improved processing of m3u8 for feed podcast downloads
* Adds a slow start parameter which can be used to make sure the crawler gets a good copy of the seed sites before starting the general crawl.
* Fixes a critical bug in IP address handling introduced by changes to the cURL library. The bug was causing many pages not to get crawled after the cURL version changed.
* Improved logging and log file rotation for crawling jobs.
Indexing and Library Functionality
* Feed item storage moved out of database into more scalable FeedArchiveBundle class.
* Adds a notion of direction to index bundle iterators, so posting lists can now be scanned in both a forward (what 6.0.4 had) and a backward direction.
* Improved Trending Calculations
* New segmentator, named-entity recognizer, part-of-speech taggers for Chinese. Improved Chinese stop words.
* Chinese language question answering implemented.
* Fixes issues with Tor and proxy crawling.
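The directional iterators mentioned in this list can be pictured with a small sketch. This is not Yioop's code (Yioop is written in PHP and its index bundle format is not described here). The class name `PostingListIterator`, the `FORWARD` and `BACKWARD` constants, and the idea of a posting list as a sorted Python list of document ids are all assumptions made for illustration.

```python
from typing import Iterator, List

# Hypothetical direction constants; these are not Yioop identifiers.
FORWARD = 1
BACKWARD = -1


class PostingListIterator:
    """Walks a sorted list of document ids in either direction."""

    def __init__(self, postings: List[int], direction: int = FORWARD):
        if direction not in (FORWARD, BACKWARD):
            raise ValueError("direction must be FORWARD or BACKWARD")
        self.postings = postings
        self.direction = direction
        # Start at the first posting going forward, the last going backward.
        self.position = 0 if direction == FORWARD else len(postings) - 1

    def __iter__(self) -> Iterator[int]:
        return self

    def __next__(self) -> int:
        # Stop as soon as the position runs off either end of the list.
        if not 0 <= self.position < len(self.postings):
            raise StopIteration
        doc_id = self.postings[self.position]
        self.position += self.direction
        return doc_id


postings = [3, 8, 21, 34]
forward_scan = list(PostingListIterator(postings, FORWARD))
backward_scan = list(PostingListIterator(postings, BACKWARD))
```

If document ids happen to be assigned in crawl order (an assumption, not something these notes state), a backward scan would reach the most recently added documents first without reading the whole list.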
Admin, Group, Wiki, and Yioop Interface
* Add configurable cookie consent feature to UI for GDPR.
3.1.1 04 Sep 2015 19:52
* Adds support for Keyword Advertising (https://www.seekquarry.com/p/Documentation#Keyword%20Advertising) with its own unique ad keyword pricing model. Findcan.ca (https://www.findcan.ca) demonstrates this in action and now supports sign-up for advertisements (https://www.findcan.ca/advertise).
* The keyword advertising system integrates with a payment processing script (https://www.seekquarry.com/adscript) available for download for a fee. This script uses Stripe.com to handle credit card transactions.
* Yioop has been rewritten to work with the popular PHP package manager Composer (https://getcomposer.org/), and Yioop is available from the Composer package repository https://packagist.org. This should make it easier for people to develop projects using Yioop's natural language processing facilities.
* Yioop's MediaUpdater process has been rewritten so that it can run in a distributed fashion. It now supports recoding videos uploaded to the wiki system and group feed system to mp4. It also supports sending out notification emails, which had previously been done exclusively by the web app.
* In addition to the centroid-based and ad-hoc web page summarizers, there is also a graph-based summarizer that can be used during crawling.
* Dutch, Hindi, Persian, and Portuguese stemmers added.
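As a rough illustration of what a centroid-based summarizer does, the sketch below scores each sentence by its cosine similarity to the term-frequency vector of the whole page (the "centroid") and keeps the highest-scoring sentences in their original order. This is a simplified textbook variant, not Yioop's implementation. The function names, the regex-based tokenizer and sentence splitter, and the `max_sentences` parameter are all assumptions made for this example.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid_summary(text: str, max_sentences: int = 2) -> List[str]:
    """Return the sentences closest to the document centroid, in page order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    centroid = Counter(tokenize(text))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(Counter(tokenize(sentences[i])), centroid),
        reverse=True,
    )
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]


sample = ("Yioop crawls web pages. Yioop indexes web pages for search. "
          "The weather was nice.")
summary = centroid_summary(sample, max_sentences=1)
```

A graph-based summarizer, by contrast, typically ranks sentences by how strongly they are connected to other sentences rather than by their distance to a single centroid vector.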
2.1 06 Mar 2015 18:02
* Fixes some security issues in Version 2.0 with regard to checking allowed activities of a user.
* Improves the accuracy of how Yioop counts the number of documents containing a word or phrase.
* Improves email notifications from group feeds.
* Adds number of groups column to manage user lists.
* Adds number of users info to manage groups lists.
*Fixes a number of places where the Yioop code was generating Notices.
2.01 26 Jan 2015 00:42
* New integrated wiki help system throughout software
'''Search and Crawling'''
* Adds Docx support. For zipped formats like Office, Yioop can now use a partial Zip extractor to extract content even if the whole file was not downloaded.
* Adds support for rel canonical meta tag
* Adds French, Spanish, German, Russian stemmers
* Adds support for Gopher protocol
* Word filter plugin can apply domain and url specific rules
* Improved scheduling of page download based on number of DNS lookups
* Improved handling of robots.txt files when site in question is congested
* arc_tool supports count recalculation and url suggestion injection
* Updater now has a scraper for HTML pages with news
* Updater can now extract images from news feeds.
* Updater can now auto-convert video files to mp4 and webm
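To make the rel canonical item in this list concrete, here is a minimal sketch of how a crawler might read a page's canonical URL so that duplicate URLs can be folded into one. It is not taken from Yioop. The names `CanonicalLinkParser` and `canonical_url` are invented for this example, and it assumes the page declares its canonical URL with a `<link rel="canonical" href="...">` element in the HTML.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; attribute values may be None.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def canonical_url(page_url: str, html: str) -> str:
    """Return the declared canonical URL, or page_url if none is declared."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        return page_url
    # Resolve relative hrefs against the URL the page was fetched from.
    return urljoin(page_url, parser.canonical)


page = '<html><head><link rel="canonical" href="/article"></head></html>'
resolved = canonical_url("https://example.com/article?utm_source=feed", page)
```

Here `resolved` is `https://example.com/article`, so the tracking-parameter variant and the clean URL would be treated as the same document.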
'''Feeds and Wikis'''
* Adds ability to drag and drop images, video, and other documents in posts and wiki pages
* Besides standard wiki pages, Yioop now supports slide presentation pages, media gallery pages, and page aliases
* +/- Voting available on Group posts
* Can configure so that posts expire after a certain amount of time
* Can use the Configure activity to set the site's look and feel, from icons to background to timezone, etc.
* Can configure Yioop to use the wiki system for the default landing page
* GUI for adding Ad Server scripts