Large Scale Data Mining using Genetics-Based Machine Learning

Below you may find the slides of the GECCO 2009 tutorial that Jaume Bacardit and I put together. Hope you enjoy it.

Slides

Abstract

We are living in the petabyte era. We have larger and larger data to analyze, process, and transform into useful answers for the domain experts. Robust data mining tools that can cope with petascale volumes and/or high dimensionality while producing human-understandable solutions are key in several domain areas. Genetics-based machine learning (GBML) techniques are perfect candidates for this task due, among other reasons, to the recent advances in representations, learning paradigms, and theoretical modeling. If evolutionary learning techniques aspire to be a relevant player in this context, they need to have the capacity to process these vast amounts of data, and they need to process this data within reasonable time. Moreover, massive computation cycles are getting cheaper and cheaper every day, giving researchers access to unprecedented degrees of parallelization. Several topics are interlaced in these two requirements, among them: (1) having the proper learning paradigms and knowledge representations, (2) understanding them and knowing when they are suitable for the problem at hand, (3) using efficiency enhancement techniques, and (4) transforming and visualizing the produced solutions to give back as much insight as possible to the domain experts.

This tutorial will try to address these topics, following a roadmap that starts with the questions of what large means and why large is a challenge for GBML methods. Afterwards, we will discuss different facets in which we can overcome this challenge: efficiency enhancement techniques, representations able to cope with high-dimensional spaces, and scalability of learning paradigms. We will also review a topic interlaced with all of them: how we can model the scalability of the components of our GBML systems to better engineer them and get the best performance out of them on large datasets. The roadmap continues with examples of real applications of GBML systems and finishes with an analysis of further directions.

Related posts:

  1. Observer-Invariant Histopathology using Genetics-Based Machine Learning
  2. Deadline extended for special issue on Metaheuristics for Large Scale Data Mining
  3. [BDCSG2008] Algorithmic Perspectives on Large-Scale Social Network Data (Jon Kleinberg)

Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

Below you may find the slides I used during GECCO 2009 to present the paper titled “Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre”. An early preprint is available as IlliGAL technical report No. 2009001, and the full paper can be found at the ACM Digital Library.

Related posts:

  1. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  2. Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
  3. Scaling Genetic Algorithms using MapReduce

NIGEL 2006 revisited (Part VI): Bacardit

This is the last of the NIGEL 2006 talks. Enjoy this last one.

Jaume Bacardit

Video
[vimeo clip_id=5065758 width="432" height="320"]

Slides
[slideshare id=1384657&doc=nigel-2006-bacardit-090504154202-phpapp02]

NIGEL 2006 revisited (Part V): Bernadó and Lanzi

After a brief break, the last two rounds of talks are coming. This week two more NIGEL 2006 talks. Enjoy this fifth release, Bernadó vs. Lanzi.

Ester Bernadó-Mansilla

Video
[vimeo clip_id=5065762 width="432" height="320"]

Slides
[slideshare id=1384643&doc=nigel-2006-bernado-090504153926-phpapp02]

Pier Luca Lanzi

Video
[vimeo clip_id=5065667 width="432" height="320"]

Slides
[slideshare id=1384584&doc=nigel-2006-lanzi-090504152951-phpapp02]

SIGEVOlution Volume 3, Issue 3, Out Now!

The new issue of SIGEVOlution is now available for you to download from:
http://www.sigevolution.org

The issue features:

  • An Interview with John H. Holland with an introduction by Lashon Booker
  • It’s Not Junk! by Clare Bates Congdon, H. Rex Gaskins, Gerardo M. Nava & Carolyn Mattingly
  • car racing @ CIG-2008
  • GECCO-2009 competitions
  • new issues of journals
  • calls & calendar

NIGEL 2006 revisited (Part IV): Llorà and Casillas

This week two more NIGEL 2006 talks. Enjoy this fourth release, Llorà vs. Casillas.

Xavier Llorà

Video
[vimeo clip_id=4727857 width="432" height="320"]

Slides
[slideshare id=1384570&doc=nigel-2006-llora-xeccs-090504152642-phpapp01]

Jorge Casillas

Video
[vimeo clip_id=4727943 width="432" height="320"]

Slides
[slideshare id=1550779&doc=nigel-2006-casillas-090608160722-phpapp02]

NIGEL 2006 revisited (Part III): Butz and Barry

This week two more NIGEL 2006 talks. Enjoy this third release, Butz vs. Barry.

Martin Butz

Video
[vimeo clip_id=4593358 width="432" height="320"]

Slides
[slideshare id=1384628&doc=nigel-2006-butz-090504153553-phpapp02]

Alwyn Barry

Video
[vimeo clip_id=4727803 width="432" height="320"]

Slides
[slideshare id=1384652&doc=nigel-2006-barry-090504154054-phpapp01]

NIGEL 2006 revisited (Part II): Booker and Dasgupta

This week two more NIGEL 2006 talks. Enjoy this second release, Dasgupta vs. Booker.

Dipankar Dasgupta

Video
[vimeo clip_id=4592273 width="432" height="320"]

Slides
[slideshare id=1384601&doc=nigel-2006-dasgupta-090504153353-phpapp01]

Lashon Booker

Video
[vimeo clip_id=4592087 width="432" height="320"]

Slides
[slideshare id=1384637&doc=nigel-2006-booker-090504153739-phpapp02]

Transcoding NIGEL 2006 videos

Last week Pier Luca Lanzi was visiting IlliGAL. Yesterday, before he left for Chicago, we went for one last brunch. He mentioned that he liked the videos we shot during NIGEL 2006 a lot. Thinking about it, we agreed it would be useful to recover the videos and upload them to some of the usual video sharing sites. Currently they are hosted, for long-term storage purposes, at NCSA’s web archive. I spent some time retrieving them from the archive (they are pretty fat and encoded in wmv) and I started transcoding them to m4a. My plan? Make them available via Vimeo and LCS & GBML Central. I will also upload the presentation slides to SlideShare and make them available via LCS & GBML Central.

Update: The first two videos (Wilson and Goldberg) are already available at LCS & GBML Central.

Related posts:

  1. NIGEL 2006 Part VI: Bacardit
  2. NIGEL 2006 Part V: Bernadó vs. Lanzi
  3. NIGEL 2006 Part IV: Llorà vs. Casillas

Design and Development of Videogames @ the Politecnico di Milano

The Dipartimento di Elettronica e Informazione, together with Centro
METID, organized a series of four meetings on the design and development of
videogames.

The brochure is available here

The meetings, hosted at the Educafe, will involve presenters from the
videogaming industry and the Center for Computer Games Research (CGR) of
the IT-University of Copenhagen.

Calendar

  • April 23 – 13:00 – Design, Progettazione e Sviluppo di Videogiochi in
    Italia: l’esperienza di Milestone (in Italian)
  • May 14 – 13:00 – Games @ ITU: Study, Development, Research, Center for
    Computer Games Research, Copenhagen, Denmark (in English)
  • June 25 – 13:00 – Videogiochi: dal linguaggio simbolico alla
    rappresentazione della realtà (in Italian)
  • July 9 – 13:00 – To be announced

Location

Educafe – Cloister North Building
Politecnico di Milano, Piazza Leonardo da Vinci, 32

Organizers

Pier Luca Lanzi – lanzi@elet.polimi.it
Daniele Loiacono – loiacono@elet.polimi.it
Dipartimento di Elettronica e Informazione

Support

Augusto Buzzi & Damiano Zanzarelli
Centro METID