solo

Solo is a metasearch engine, that is, a program that extracts data from existing search engines.

December 17, 2007

Uses of Solo

  1. Real-time product price comparison, as a reference when buying a product;
  2. Collecting product specifications.

posted @ 2007-12-27 09:43 solo | Views (205) | Comments (0)

Screenshot, 2007-12-20

Results page

posted @ 2007-12-20 22:57 solo | Views (217) | Comments (0)

TODOs and Issues

TODO List:

  1. Identify attributes in a web page
  2. Deal with multiple attributes in a single line while comparing
  3. Show already mapped attributes in compare dialog
  4. Filter "related product" area to reduce the number of hunks (by identifying instance URLs in the web page)
  5. Use a third-party (OSS) Java diff library to remove the "org.eclipse.compare" dependency
  6. [Web] Add attribute value filtering options in search page
  7. Add "washer" or "MessageFormat" to attribute entry
  8. Specify whether an attribute is long text (e.g. description) or image URL
  9. Add popularity property to ore, evaluate it by speed, usage, etc.
  10. Solo data partition
  11. Show downloading progress bar in web interface
  12. Add order property to Attribute
  13. Result page columns categorized by ores
  14. Give each user a thread pool size according to their level, default = 3
  15. Ores of a category should be derived, like attribute inheritance
  16. Solve the problem that one ore maps attributes differently in different categories
  17. Model advanced search of ores
  18. Automatically discover search URL pattern of ores
  19. Convert relative HREFs to absolute so that they can be recognized by instance URL pattern
  20. Add test query keyword for Category (or Ore) as an attribute, for easy testing purpose
  21. Ability to map multiple attributes in web page to one
  22. Package as an RCP product
  23. Allow marking an attribute of an ore as "not available"
  24. Cache most recent downloaded web pages, for re-compare purpose
  25. Remove tag content to reduce hunks
  26. Remove unique content in product URL to reduce hunks
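
As an illustration of item 19, here is a minimal Java sketch of resolving relative HREFs against the page URL before matching them against an instance URL pattern. The class name `HrefResolver`, its methods, and the example URLs and pattern are hypothetical and are not taken from the Solo code base; the sketch only relies on the standard `java.net.URI.resolve` call and assumes the HREFs are already syntactically valid URI references.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solo: resolves HREFs and filters them
// by an instance URL pattern.
final class HrefResolver {

    // Resolve an HREF found in a page against the URL of that page.
    // Absolute HREFs are returned unchanged; relative ones are made absolute.
    // URI.create throws IllegalArgumentException for malformed input
    // (for example, an HREF that contains unescaped spaces).
    static String toAbsolute(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href.trim()).toString();
    }

    // Keep only the links whose absolute form fully matches the instance URL pattern.
    static List<String> instanceUrls(String pageUrl, List<String> hrefs, Pattern instancePattern) {
        List<String> result = new ArrayList<>();
        for (String href : hrefs) {
            String absolute = toAbsolute(pageUrl, href);
            if (instancePattern.matcher(absolute).matches()) {
                result.add(absolute);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        String page = "http://shop.example.com/list/page1.html";
        List<String> hrefs = Arrays.asList("../product/42.html", "/about.html");
        Pattern instance = Pattern.compile("http://shop\\.example\\.com/product/\\d+\\.html");
        System.out.println(instanceUrls(page, hrefs, instance));
    }
}
```

Resolving first and matching second means a single absolute pattern per ore can cover links written in relative, root-relative, or absolute form.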

Issue List:

  1. [Desktop] Concurrently download test pages in the comparing dialog.
  2. Remove org.eclipse.swt dependency from solo model
  3. Instance URL pattern of Ore should be multiple (allow an ore to have multiple instance URL patterns)
  4. Use relative path for default.solo
  5. Clear prior mapping when an attribute is assigned again, provide "remove mapping" button
  6. Add progress indicator for attribute extraction dialog while refreshing the comparison area
  7. Add as test instance URL when two URLs are entered to be compared
  8. Allow mapping multiple attributes in mapping dialog without pressing OK button
  9. Add add/remove category/attribute function
  10. Provide category selection function in editing ore dialog
  11. Replace compare area with Table for better performance
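
Issue 1 (concurrent download of test pages) could be approached roughly as in the Java sketch below. It is an assumption-laden illustration, not Solo's implementation: `ConcurrentPageFetcher`, `fetchAll`, and the `fetcher` function are hypothetical names, and the pool size of 3 simply mirrors the default mentioned in TODO item 14. The actual download is passed in as a function so the sketch stays independent of any HTTP library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Solo: downloads several pages in parallel
// on a fixed-size thread pool.
final class ConcurrentPageFetcher {

    // Assumed default, taken from "default = 3" in the TODO list.
    static final int DEFAULT_POOL_SIZE = 3;

    // Applies fetcher to every URL concurrently and returns the page contents
    // in the same order as the input URLs.
    static List<String> fetchAll(List<String> urls, Function<String, String> fetcher, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> fetcher.apply(url));
            }
            List<String> pages = new ArrayList<>();
            // invokeAll blocks until all tasks finish and keeps the task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                pages.add(future.get());
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Download interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Download failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would supply its own download function, for example `ConcurrentPageFetcher.fetchAll(urls, url -> download(url), ConcurrentPageFetcher.DEFAULT_POOL_SIZE)`, where `download` stands for whatever page-fetching routine is already in use.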

posted @ 2007-12-17 19:39 solo | Views (280) | Comments (0)