unstruct.org My unstructured thoughts and rants


Web Search, Extraction & Machine Learning positions at Radar Networks [San Francisco]

Required Skills

  • Java development: Exceptional skills with Java, with at least 3 to 5 years of professional (or significant academic) coding experience in Java.
  • Search engines for the Web: Crawling, resource discovery, indexing, harvesting, and extraction, using tools such as Lucene, Nutch, and Hadoop. Experience in scaling search engines to handle massive amounts of data. Experience in using ontologies/taxonomies in search is desired; graph search and social network analysis are also of interest.
  • Experience with modern software engineering practices and paradigms - we want you to develop beautiful code which is a pleasure to look at and is easy to maintain.

Optional Specialized Skills

  • Data Extraction and Harvesting: Harvesting knowledge from unstructured and structured datasets. Entity detection, topic detection, document segmentation and classification. Familiarity with products such as InXight, GATE, UIMA, MinorThird, Mallet, WEKA, and/or other text mining technologies. Natural language processing skills are also a plus.
  • Machine Learning for Search and Classification: Machine learning algorithms to assist with search, classification, clustering, personalization, optimization and data extraction. Supervised, unsupervised, and Bayesian learning, SVMs, HMMs, graph theory, graph search and vector search algorithms.
  • Semantic Web: Experience with the Semantic Web, RDF, OWL, and reasoning over semantic data and ontologies.
Filed under: General

3D photo collages created automatically from photos

Researchers from Microsoft and the University of Washington have created a very impressive way of organizing and indexing unstructured photos. The system extracts distinctive features from each image and then aligns them pairwise across images. From all of these alignments, the original position of each camera can be estimated. It’s impressive because the system does not need to know the geometry or location of any of the cameras, and any picture can be used for this, not just images from the same camera.
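
To make the pairwise alignment step a bit more concrete, here is a rough Python sketch. It is not the researchers' code: the post does not say how features are described or matched, so the nearest-neighbour matching with a ratio test used here is an assumption, and the function names, the `ratio` parameter and the toy two-number "descriptors" are all made up for illustration. Estimating the camera positions from the matches is not shown.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def match_features(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where descriptor i of photo A matches j of photo B.

    Hypothetical helper: a match is kept only if the nearest descriptor in B
    is clearly closer than the second nearest (a ratio test, assumed here).
    """
    matches = []
    if len(desc_b) < 2:
        return matches  # the ratio test needs at least two candidates
    for i, da in enumerate(desc_a):
        # Rank every descriptor in B by its distance to this descriptor in A.
        ranked = sorted(range(len(desc_b)), key=lambda j: dist(da, desc_b[j]))
        best, second = ranked[0], ranked[1]
        if dist(da, desc_b[best]) < ratio * dist(da, desc_b[second]):
            matches.append((i, best))
    return matches


def match_all_pairs(photos, ratio=0.8):
    """Match every pair of photos; photos maps a name to its descriptors."""
    return {
        (a, b): match_features(photos[a], photos[b], ratio)
        for a, b in combinations(sorted(photos), 2)
    }


# Toy data: real descriptors would have many more dimensions.
photos = {
    "front.jpg": [(0.0, 0.0), (10.0, 10.0)],
    "side.jpg": [(0.1, 0.0), (10.0, 10.2), (50.0, 50.0)],
}
pairwise = match_all_pairs(photos)
```

Each entry in `pairwise` lists which features two photos appear to share. A full system would then use those shared features across all pairs to work out where each camera was standing.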

Filed under: General