Published on 24/04/2014

I have been asked whether I could help spread the word about the Beca Carnet Jove Connecta’t a la Enginyeria, one of the eleven Carnet Jove grants for 2014, which aim to promote young people's participation in, and entry into, the professional world. Of course I will! Engineers are what this country needs most!

The Connecta’t a la Enginyeria grant offers a one-year training placement at a leading company in the sector, with a stipend of €12,000. The call is open until 28 May 2014. You can find more information on this page. Come on, if you have the Carnet Jove, why not apply?

Spark: Big Data Analytics Beyond Hadoop

Published on 20/04/2014


Hadoop is definitely the de facto standard for large-scale data processing across nearly every industry and enterprise. However, as the “Volume”, “Variety” and “Velocity” of data increase, Hadoop, as a batch processing framework, cannot cope with the requirement for real-time analytics. As we saw in our Technology Basics for Data Scientist course, the scientific community is offering alternatives such as Storm, a framework open sourced by Twitter that provides event processing and distributed computation capabilities. Storm uses custom-made “spouts” and “bolts” to define information sources and manipulations, allowing distributed processing of streaming data. A Storm application is designed as a topology of interfaces that create a “stream” of transformations. It provides functionality similar to a MapReduce job, with the exception that the topology will theoretically run indefinitely until it is manually terminated. Hortonworks, one of the main players in the Hadoop distribution arena, is working to integrate the Storm stream-processing engine with its Hadoop distribution.
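To make the spout and bolt vocabulary concrete, here is a minimal sketch in plain Python. It does not use the Storm API: the names `sentence_spout`, `split_bolt`, `count_bolt` and `run_topology` are hypothetical, and a short list stands in for what would be an unbounded stream in a real topology.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def sentence_spout(sentences: Iterable[str]) -> Iterator[str]:
    """Spout: the source of the stream; emits one sentence at a time."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream: Iterable[str]) -> Iterator[str]:
    """Bolt: turns each incoming sentence into lowercase words."""
    for sentence in stream:
        for word in sentence.lower().split():
            yield word


def count_bolt(stream: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Bolt: keeps running word counts and emits a snapshot after each word."""
    counts: Counter = Counter()
    for word in stream:
        counts[word] += 1
        yield dict(counts)


def run_topology(sentences: Iterable[str]) -> Dict[str, int]:
    """Wire spout -> split bolt -> count bolt and return the last snapshot."""
    latest: Dict[str, int] = {}
    for latest in count_bolt(split_bolt(sentence_spout(sentences))):
        pass  # a real topology would keep running; here the input is finite
    return latest


if __name__ == "__main__":
    print(run_topology(["big data is big", "data moves fast"]))
```

Because every stage is a generator, each word flows through the whole chain as soon as it is emitted, which is the main contrast with a batch job that waits for the complete input.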

What are the other two main players in the Hadoop distribution arena (Cloudera and MapR) doing? They chose to integrate and support another platform: Spark, an open-source data-processing engine (an Apache Top-Level Project since February 2014) originally developed in the AMPLab at UC Berkeley (with over 100 contributors from 30+ organisations) and built around speed, ease of use, and sophisticated analytics.

Spark is an in-memory data-processing platform that is compatible with Hadoop data sources. It is particularly well suited to machine learning workloads, as well as interactive data queries, and includes APIs in Scala, Python and Java. Spark is compatible with Hadoop’s Distributed File System (HDFS), HBase, and any Hadoop storage system, so existing data is immediately usable in Spark.

Spark is already in use at a number of large web companies and web startups, and a startup called Databricks, which aims to commercialise Spark, has been created by a team led by professors. The people involved include co-founder and CEO Ion Stoica (University of California, Berkeley professor and co-founder and CTO of Conviva); co-founder and CTO Matei Zaharia (MIT professor); Ben Horowitz (general partner at Andreessen Horowitz and former Opsware co-founder and CEO); and Scott Shenker (University of California, Berkeley professor and former Nicira co-founder and CEO). Databricks recently received $14M in venture funding.

MapR plans to support the entire Spark stack, which includes the Shark SQL query engine (essentially a faster Apache Hive) and the MLlib machine learning library. Cloudera, on the other hand, does not support Shark (probably because Cloudera still has an incentive to push its own Impala SQL query engine).

I think we should keep an eye on its evolution, since I am sure that Spark will play an important role in the Big Data arena and that we will very soon see many analytic applications using Spark as their engine.

I hope this post helps my students learn more about this interesting big data framework. In a future post I will introduce the Spark ecosystem in more detail.

Drones: Google acquires a new startup

Published on 15/04/2014




Google has acquired Titan Aerospace, a start-up that builds large solar-powered drones able to stay in flight for years. From what I have read, they can easily carry equipment for applications such as mapping, tracking and communication. It seems that Google will have Titan Aerospace work closely with Project Loon, which uses high-altitude balloons to provide internet access. I read at Gigaom that the drones can deliver internet at speeds of up to 1 gigabit per second. The drones could also be useful to Google Earth, because they could generate images that are refreshed more frequently than images taken by satellites, opening up new possibilities for Google Earth. And anything else that we cannot even imagine! Google's imagination is enormous, right?

As I always say to my students, things move fast! Out there, beyond the classroom, everything is possible for you; it is your chance!

Are you wondering if I am interested in drones? Of course! It is an exciting topic in the Big Data world! Here are two pictures from last week! ;-)





Thank you to David Tous from safsampling.com for his excellent introduction to the world of drones!

Hadoop distribution: Main Players-Actores principales

Published on 05/04/2014



Apache Hadoop is the most popular framework used for processing large amounts of data in the Big Data arena. It is clear that Hadoop is here to stay. That is why I always suggest to my students that it is important to know how it works. For the courses I teach where we do not have lab sessions I produced this hands-on for a quick glimpse. If you are interested in learning more about Hadoop you can start with this hands-on that includes some bibliographic references.

Some former students and friends who are already in the industry have asked me to recommend some of the distributions available in the market. Each distribution is different, and as a researcher I do not have in-depth knowledge of all of them; in my work I deal mainly with Hadoop internals. However, in my opinion, the following three are the main players in the Hadoop distribution arena: Cloudera, Hortonworks and MapR. Obviously we cannot ignore players such as IBM, Teradata, Oracle, Microsoft and Amazon, among others, that include Hadoop in their software stacks.

I recommend these three primarily because all of them offer free versions (each will have some level of restriction). Also each vendor offers VM images with Linux and Hadoop already installed which makes the installation process much easier.

However this scenario will change very quickly due to the dynamism in company mergers and acquisitions in this field.

I hope you find this information useful. Good luck!



Conference: Big Data, realities and challenges

Published on 28/03/2014

This week I was invited to the XII Jornades Fòrum Català d’Informació i Salut (programme) to give the opening talk at CosmoCaixa. Thanks to the Fòrum CIS board for inviting me; taking part was very enriching for me too.

As always, whenever I can I make the slides available to everyone in case they are of interest. You can find the slides on SlideShare: http://www.slideshare.net/jorditorres/big-datarealitatsreptes. The video of the presentation, edited by the organisers, is available on YouTube.

Big Data: from the Cloud to the earth

Published on 28/03/2014


Next week, as part of the seminar series of the CLC-MIRI course (one of the courses I teach this term) at the Facultad de Informática de Barcelona (UPC), we have the pleasure of welcoming José Alejandro Cordero Rama, who will share his experience and advanced knowledge of Big Data with my students. As usual, the seminars in my courses are open to the whole UPC academic community and to the scientific community around the Barcelona Supercomputing Center, where we do research on these topics. The details of the talk follow. I hope to see you all there!


Title: “Big Data: from the Cloud to the earth”
Speaker: José Alejandro Cordero Rama

Day: Tuesday 01.04.2014
Time: 12:00 – 13:30
Room: Aulari A5, room A5106, UPC Campus Nord


Abstract: “‘I always see amazing infographics and almost sci-fi videos when I hear about Big Data projects, but I don’t see these amazing things around me… why might that be?’ In this presentation we will explain some examples of real Big Data projects that are being developed with the participation of Barcelona Digital. We will focus on the power of Big Data to solve the proposed projects and on how this translates into a real system. Using this difference between concept and implementation as an excuse, we will discuss some of the typical problems Data Scientists face when they deal with real Big Data problems, such as lack of data, heterogeneity of the processed data, the validation process, security and privacy.”

CV: “He works designing and implementing statistical models for different types of problems. Nowadays, the two main topics he is working on are modelling of user behavior at home by analyzing data from sensors and the design of recommendation systems. He holds a Master in Research (MERIT master, UPC) and previously studied Computer Science (FIB,UPC) and Telecommunications Engineering (ETSETB,UPC). Before joining Barcelona Digital he worked for Gem-Med (Barcelona, a company focused on biomedical signal analysis) and worked on research (PVEU, UPC and National Institute of Informatics, Tokyo).”

Seminar on Big Data and Cloud at UPC

Published on 17/03/2014

Next week, as part of the seminar series of the CLC-MIRI course (one of the courses I teach this term) at the Facultad de Informática de Barcelona (UPC), we have the pleasure of welcoming back Tiago Henriques, Solutions Architect at Amazon Web Services, who will share his experience and advanced knowledge of Cloud Computing and Big Data with my students. As usual, the seminars in my courses are open to the whole UPC academic community and to the scientific community around the Barcelona Supercomputing Center, where we do research on these topics. This time we will even be joined by secondary school students who will take the opportunity to visit UPC (their presence is a real treat for all of us). If anyone from outside UPC or BSC would like to attend the talk or meet Tiago, I would appreciate it if you let me know beforehand by email (torres@ac.upc.edu) for logistical reasons. The details of the talk follow. I hope to see you all there!

(presentation slides)

Title: “Big Data with AWS: Tools and Customers Uses Cases”
Speaker: Tiago Henriques

Day: Tuesday 25.03.2014
Time: 12:00 – 13:45
Room: Aulari A6  Aula A6001-Amfiteatre. UPC Campus Nord


Note that the talk will be in English, since the master's programme is international, with students from all over the world.

Abstract: “In this presentation we will cover different use cases of Big Data customers on top of Amazon Web Services (AWS) platform. We will get into detail on several technical solutions that different companies leverage to implement Big Data processes, such as scientific research, web analytics, sentiment analysis or recommendations. It includes processing data in “batch” vs real-time or using more agile data warehouse solutions.”

CV: “Tiago Henriques is a Solutions Architect at Amazon Web Services. He spends most of his time working directly with customers, from enterprises to startups, focused on technical solutions and system architectures using the AWS cloud. He holds a Post Degree in Production Management (ESCAC, Barcelona) and previously studied Computer Science (IST/UTL and FCUL/UNL). Before joining AWS he worked for Bestiario (Barcelona), a company focused on Interactive Data Visualization. In the past he also taught classes at several institutions (IAAC in Barcelona, Master in New Media at the Fine Arts University of Lisbon).”

Slides on Marenostrum 3

Published on 10/03/2014

This week I visited the Marenostrum 3 supercomputer with the students of one of the courses I teach this term. With 1 Petaflop/s, it currently ranks 34th in the world ranking, making it the most powerful supercomputer in Spain and the eleventh in Europe.

Since the supercomputer was recently renewed, I had to prepare my notes with the main figures again. People sometimes ask me for these figures, which I honestly do not know by heart, so I thought it would be good to share the slides I used in the course (which are basically those of Javier Bartolomé, BSC system head).

An energy-independent Cloud with no CO2 emissions

Published on 17/02/2014

The search for an energy-independent Cloud with no CO2 emissions is what our research group at the Barcelona Supercomputing Center – Centro Nacional de Supercomputación (BSC-CNS) is working on. Within the new RenewIT project, we are investigating a proof of concept of energy-independent data centres, powered by local sources and with zero CO2 emissions. The future of the Cloud depends on these premises if we want the growth we are experiencing to be sustainable, as I already pointed out in earlier posts.

The use of wind, solar and biomass energy (among others), all of them local and in surplus, frees data centres from external energy dependence. However, one of the challenges in using this type of energy is that it fluctuates with the day, the hour and the season. BSC-CNS will develop algorithms to decide the optimal placement of workloads across data centres and the timing of their execution, so as to get the most work done with the fewest emissions. To our research colleagues we say that we aim to “Reduce the Carbon Cost of Cloud Computing”, an essential topic.
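As a rough illustration of the placement problem (not the RenewIT algorithms, which are still to be developed), the sketch below greedily assigns each job to the cleanest data centre that still has room. The `DataCentre` and `Job` classes, their fields and units, and `place_jobs` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataCentre:
    name: str
    free_capacity: float     # assumed unit: kWh of work the centre can still take
    carbon_intensity: float  # assumed unit: gCO2 emitted per kWh consumed


@dataclass
class Job:
    name: str
    energy: float            # assumed unit: kWh the job is expected to consume


def place_jobs(jobs: List[Job], centres: List[DataCentre]) -> Dict[str, Optional[str]]:
    """Greedy placement: largest jobs first, each on the cleanest centre that fits.

    Jobs that fit nowhere are mapped to None, meaning "defer to a later time slot".
    Note that this function updates free_capacity on the centres it is given.
    """
    placement: Dict[str, Optional[str]] = {}
    for job in sorted(jobs, key=lambda j: j.energy, reverse=True):
        candidates = [c for c in centres if c.free_capacity >= job.energy]
        if not candidates:
            placement[job.name] = None
            continue
        best = min(candidates, key=lambda c: c.carbon_intensity)
        best.free_capacity -= job.energy
        placement[job.name] = best.name
    return placement


if __name__ == "__main__":
    centres = [DataCentre("solar", 10.0, 20.0), DataCentre("grid", 100.0, 400.0)]
    jobs = [Job("a", 8.0), Job("b", 5.0), Job("c", 200.0)]
    print(place_jobs(jobs, centres))
```

A greedy rule like this is only a starting point: it ignores how the supply of renewable energy changes over time, which is exactly why the timing of execution matters as much as the location.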

Our contribution to the project is to provide suitable system software to manage and redistribute tasks across a network of connected data centres, optimising energy efficiency on the basis of clean energy. Combined with improvements in hardware and facility efficiency and with the use of renewable energy, this should be a step towards data centres with zero polluting emissions that do not depend on energy from other countries.

With funding of 3.6 million euros over three years, RenewIT brings together seven European partners, both research centres and companies. The project is led by the Institut de Recerca en Energia de Catalunya (IREC); besides BSC-CNS, the partners are AIGUASOL, a cooperative specialising in energy consulting; Deerns, a Dutch multinational specialising in data centre design; Chemnitz University of Technology (Germany); Loccioni Group, an Italian company specialising in data centre monitoring; and 451 Research, a London-based analyst firm for the telecommunications sector.

I hope to be able to give you good news in a while! Those who know me know that this is one of the topics that motivates me personally.

You can find the official BSC press release here.

Aeneas: Tool to support the design of data management code for Big Data applications

Published on 08/02/2014

Non-relational databases have arisen as a solution to the scalability problems of relational databases when dealing with big data applications. Since they are highly configurable, their performance is heavily affected by user decisions. To maximize performance, many different data models and queries must be analyzed in order to choose the best fit. This requires performing a wide range of tests, which can cause productivity issues.

We are glad to announce the open sourcing of Aeneas, a tool to support the design of data management code for applications using non-relational databases created at BSC. Aeneas provides an easy and fast methodology to support the decision about how to organize and retrieve data in order to improve the performance.

You can download the code from https://github.com/cugni/aeneas, modify it, improve it and collaborate on its development! The repository also contains a short introduction to the framework and a brief “quick start” for new users. Along with the stable trunk, we decided to release the unstable, under-development one as well, in the hope of getting community feedback.

Conference: Big Data and Science

Published on 07/02/2014

Yesterday I gave the talk “Talking about Big Data & Science” at the RoMoL Seminars. Data is now considered the Fourth Paradigm in Science. What are the important open issues in the area of Big Data? You can download the complete presentation (74 slides) from this link.

Conference: “Big Data Challenges in Bioinformatics”

Published on 03/02/2014

POST UPDATE: conference slides for “Big Data Challenges in Bioinformatics”

I have been invited to give a talk on 12 February at the Institut d’Estudis Catalans, organised by Bioinformatics Barcelona. The title of the talk is «Els reptes del big data en bioinformàtica» (“Big Data Challenges in Bioinformatics”). I will present an overview and the current challenges of what has come to be called big data, a new research area that hopes to provide solutions to the new needs arising in the world of science, and in bioinformatics in particular. You can find more information at these two links (link to the BSC page and link to the IEC page).

Date: Wednesday 12 February 2014
Time: from 17:00 to 19:00 approximately.
Venue: Sala Prat de la Riba, Institut d’Estudis Catalans. C/ del Carme 47, Barcelona.
Open attendance. Capacity of approximately 150 people.


Talk at the Club de Marketing de Barcelona: Big Data?

Published on 31/01/2014

This month I gave a talk for the Club de Marketing de Barcelona in which we reviewed the main challenges of Big Data and what it could mean for their sector. At this link you can find the presentation I used, in case it is of interest. We ended with a very interesting debate.

International PHD Programme Fellowships BSC-La Caixa

Published on 26/01/2014

Under a collaborative Framework Agreement, the Ministry of Economy and Competitiveness and the “la Caixa” Foundation continue the fellowship programme started last year with a second call. This programme aims to help recruit talented students from across the world to do their doctoral thesis work in one of the accredited “Severo Ochoa” centres of excellence. The objective of this joint activity is to boost the research capacity of the best research institutions in Spain. This year, the “la Caixa” Foundation has selected the Barcelona Supercomputing Center (BSC-CNS) to offer four more grants for PhD students for the academic year 2014-2015. The grant is renewable for up to four years.

BSC-CNS is looking for young scientists from the national and international community who wish to do their PhD in a stimulating environment, rich in technological resources and campus life. We encourage applications from highly motivated engineers and computer scientists with outstanding qualifications. Successful candidates will join research groups with top-level scientists and will carry out their research in cutting-edge areas of Computer Sciences, Life Sciences, Earth Sciences and Computer Applications in Sciences and Engineering.

Applicants should indicate up to two research programmes in which they would like to work, in order of preference. Moreover, candidates interested in our research group (Autonomic Systems and eBusiness Platforms) should indicate it in the motivation letter. In that case you can contact me at jordi.torres@bsc.es before submitting your application.

Conditions and benefits

The training programme will last 4 years to complete the PhD thesis. The grant will be renewable on a yearly basis and will last 2 + 2 years. Therefore the first two years will be covered by a fellowship, after which this initial period will be evaluated for renewal for a maximum of two more years through an employment contract.

The grant will be 18,069 € gross per year during the first two years and 26,700 € gross per year during the following two-year employment contract, and will include social security contributions during the four years.

BSC-”la Caixa” fellows will benefit from the Training Programme and BSC staff benefits:

  • International multidisciplinary scientific environment
  • Advanced research training
  • Advanced computational facilities

Additional funds

The awardees of “la Caixa” grants will receive additional funding of 1500€ per year during the grant period and 1700€ per year during the contract period. This funding is intended to cover the PhD tuition fees, as well as expenses arising from congress attendance, training sessions or any other activity related to the scientific or academic activity of the awardee.

Please find all the information here

How to Install a Temperature & Humidity Sensor on your Raspberry Pi – A Step by Step Guide

Published on 10/01/2014

During the last Christmas holidays I learned how to build from scratch a temperature and humidity sensor based on a Raspberry Pi. It was an important experience for me that gave me the opportunity to complete my end-to-end view of the implementation of what we know as the Internet of Services, as implemented in COMPOSE, one of the European projects in which the Barcelona Supercomputing Center participates. COMPOSE will provide an open and scalable marketplace infrastructure, in which objects (like our sensor) are associated with services that can be combined, managed, and integrated in a standardised way to easily and quickly build innovative applications. It is an example of the convergence of the Internet of Services (IoS) with the Internet of Things (IoT).

This step-by-step guide to installing a temperature and humidity sensor on your Raspberry Pi (Revision 2 type) was written by Bernat Torres and myself. It was an amazing experience for us! The prototype built in this hands-on is also part of the scientific project of the secondary school in our town. It is an excellent example for showing students the diverse areas that make up the current technological scenario. This tutorial is clearly a taste of various current technologies in a holistic way: Electronics, Hardware, System Software, Applications, Programming Language, Internet of Services, etc.

We did not find any tutorial that secondary school students could follow in order to build a sensor prototype like this. For this reason Bernat and I think that it will be useful for other students if we summarise our knowledge and share this hands-on. We hope that you enjoy it! A Step by Step Guide LINK

Cloud & Big Data: Not just for big business anymore

Published on 15/12/2013

This week I had the honour of giving a talk at the Tech Talent Center in Barcelona, with permission to make the talk public. At this link you can find the slides I used. I hope you find them interesting.


Google Doctoral Fellowship in Barcelona

Published on 14/12/2013

Google Doctoral Fellowship in Barcelona (ref. BSC-Autonomic 02/2014)

The Research Group Autonomic Systems and eBusiness Platforms at Barcelona Supercomputing Center invites outstanding PhD candidates to apply for one full-time PhD position at UPC Barcelona Tech under the Google European Doctoral Fellowship Programme. Only high-quality PhD programmes at a few European technical universities have been invited to participate in the Google Fellowships programme. Specifically, UPC can propose only two candidates for this programme, so I invite you all to check the main programme requirements and characteristics at the following link.

Our research group will propose one PhD student to the UPC. If you are interested in being selected by our group for this programme, please send the following information (all in English) to torres@ac.upc.edu and nin@ac.upc.edu (with the subject “position autonomic 02/2014”) by 2 January: student CV, official transcripts of previous and current academic records, and any other important related information.

The PhD work will focus on massive geo-positioned streaming data processing techniques (Twitter, Instagram, Foursquare, …) and intelligent sensors (smartphones, cameras, …) to predict in real time the impact (social, economic, …) of big social events (elections, Olympic Games, worldwide conferences, national holidays, …). This position will be in the context of the smart cities, big data and sociology research areas, definitely an exciting world of multidisciplinary research.

In addition to the strong academic record needed to pass the competitive selection for these fellowships, the candidate should demonstrate the following knowledge and skills:

  • Excellent programming skills with different languages (Java, Python, …) and Linux environments, including scripting languages.
  • Excellent knowledge of algorithms and parallelism.
  • Excellent knowledge of stream mining (data mining, machine learning, …).
  • Knowledge of real-time environments for Big Data (Hadoop, NoSQL databases, Spark, Storm, …).

Another available PhD position (ref. BSC-Autonomic 01/2014) can be found at this link.


Grant to do a PhD in our research group in Barcelona

Published on 13/12/2013

“la Caixa” grant to do a PhD in our research group in Barcelona (ref. BSC-Autonomic 01/2014)

The call for grants for doctoral studies at Spanish universities from Obra Social “la Caixa” has just opened, and our research group has a research position for pursuing a PhD at our university (which meets the required quality-mention requirement) for a candidate who obtains this grant. Spanish nationality is required to apply for this grant.

The PhD work would focus on techniques for massive processing of geo-positioned streaming data (Twitter, Instagram, Foursquare, …) and intelligent sensors (smartphones, cameras, …) to predict in real time the impact (social, economic, …) of events of major social interest (elections, Olympic Games, conferences, national holidays, …). This position falls within the area of smart cities and social data science, without doubt an exciting world of multidisciplinary research.

In addition to the strong academic record needed to pass the competitive selection for these grants, the candidate should have the following knowledge and skills:

  • Excellent programming skills in several languages (Java, Python, …), as well as command of Linux environments and their scripting languages.
  • Excellent knowledge of algorithms and parallelism.
  • Excellent knowledge of stream mining (data mining, machine learning, …).
  • Knowledge of real-time Big Data environments (Hadoop, NoSQL databases, Spark, Storm, …).

Of all those interested, only one can be our candidate for these grants. We therefore ask anyone interested to contact us as soon as possible, and no later than 24 January 2014. We will confirm our choice before 27 January, so that there is enough time afterwards to prepare the application properly and so that those not chosen have time to find other opportunities (the deadline for the grants is 24 February 2014).

Those interested can send a cover letter explaining their interest, together with a brief CV, their academic transcript (and their relative position within their graduating class, if this information is available), a brief description of their final degree project and anything else they consider relevant, to both email addresses torres@ac.upc.edu and nin@ac.upc.edu (Professor Jordi Nin) with the subject “position autonomic 01/2014”.

Information about another PhD grant in our group can be found at this link: Google Doctoral Fellowship in Barcelona (ref. BSC-Autonomic 02/2014)

Talk by Daniel Villatoro: Tweetbeat of the City

Published on 06/12/2013

Next Wednesday we have a new guest lecture, by Daniel Villatoro, in our Cloud Computing and Big Data course at the FIB, taking advantage of his visit to our research group at BSC/UPC. Daniel holds a PhD in Artificial Intelligence and currently works as a researcher at Barcelona Digital, as well as being an associate professor at the University of Barcelona. It is a great honour for all of us that he accepted the invitation extended through Dr. Jordi Nin, who recently joined our research group. Below you can find information about the talk, which will be given in Spanish. The presentation is open to the whole community of the Facultat d’Informàtica de Barcelona (FIB).

Title: The Tweetbeat of the City   (presentation slides)

Summary: Participatory platforms with geo-positioning capabilities play an important role within the smart-city paradigm. The information shared in this type of platforms can be located in a city and take the pulse of the citizens’ activity. The temporal and spatial location of spots of high activity, the mobility patterns and the existence of unforeseen bursts constitute a certain Urban Chronotype, which is altered when a city-wide event happens (e.g. a world-class Congress).

This presentation will cover a few examples of data-mining at urban level, ranging from anomaly detection algorithms (both at volume and extension), data-combination and recommender systems.
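As a rough idea of what volume-based anomaly detection can look like (a generic sketch, not the speaker's algorithms), the following Python function flags hours whose message count is far above a trailing baseline. The function name `find_bursts`, the window size and the threshold are hypothetical choices for illustration.

```python
from statistics import mean, stdev
from typing import List


def find_bursts(counts: List[int], window: int = 6, threshold: float = 3.0) -> List[int]:
    """Return the indices whose count is far above the trailing-window baseline.

    A point is a burst when it lies more than `threshold` standard deviations
    above the mean of the previous `window` observations.
    """
    bursts: List[int] = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any increase at all counts as a burst.
            if counts[i] > mu:
                bursts.append(i)
            continue
        if (counts[i] - mu) / sigma > threshold:
            bursts.append(i)
    return bursts


if __name__ == "__main__":
    hourly_counts = [10, 12, 11, 9, 10, 11, 10, 60, 12]
    print(find_bursts(hourly_counts))
```

One known weakness of this simple rule is that a burst inflates the baseline that follows it, so a second spike shortly afterwards can go undetected.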

Short Bio:
Daniel Villatoro obtained his PhD in Artificial Intelligence at the IIIA-CSIC in Barcelona. His thesis received the Victor Lesser Distinguished Dissertation Award from IFAAMAS. During his research on normative multiagent systems he has visited and actively collaborated with Prof. Sandip Sen (University of Tulsa, USA), Dr. Giulia Andrighetto (ISTC-CNR, Italy) and Prof. Michael Luck (King’s College London, UK). Moreover, he has been an active member of the community, publishing in specialized journals (JAAMAS, ACM TAAS, or PLoS-One) and organizing workshops (MABS2011 or Citisen2012-13). His research interests focus on modelling human behaviour, self-policing multi-agent systems and complex systems. Currently, Daniel works as a Researcher in Citizen Sensor Networks and Smartcities at Barcelona Digital, and as an associate professor at the University of Barcelona.

Day: Wednesday 11/12/2013

Time: 15:00

Room: A6 206  (Campus Nord UPC)

Python Quick Start

Published on 04/12/2013

Python is a widely used programming language, started by Guido van Rossum, that supports multiple programming paradigms (its source code is available under the open-source, GPL-compatible Python Software Foundation License). Is Python the most popular programming language at the moment, in the Big Data arena too? Python still lacks some features for big data analytics, but it seems to be closing the gap fast. In any case, according to two of our senior researchers at BSC (David Carrera and Yolanda Becerra, the two Big Data experts in our research team), Python is quickly becoming one of the most popular programming languages. This hands-on shows some basic characteristics of Python to help you get started with this great language. Are you interested? HANDS-ON LINK
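As a small taste of what "multiple paradigms" means in practice, here is a short sketch (not taken from the hands-on itself) that computes the same sum in procedural, object-oriented and functional style. The names `total_procedural`, `Accumulator` and `total_functional` are made up for this example.

```python
from functools import reduce
from typing import List


# Procedural style: an explicit loop and a mutable accumulator.
def total_procedural(values: List[int]) -> int:
    total = 0
    for value in values:
        total += value
    return total


# Object-oriented style: state and behaviour live together in a class.
class Accumulator:
    def __init__(self) -> None:
        self.total = 0

    def add(self, value: int) -> "Accumulator":
        self.total += value
        return self  # returning self allows chained calls


# Functional style: no mutation, just a fold over the sequence.
def total_functional(values: List[int]) -> int:
    return reduce(lambda acc, value: acc + value, values, 0)


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(total_procedural(data))
    print(Accumulator().add(1).add(2).add(3).add(4).total)
    print(total_functional(data))
    # A list comprehension: squares of the even numbers only.
    print([n * n for n in data if n % 2 == 0])
```

All three versions give the same result; which one reads best depends on the problem, and Python lets you mix them freely in the same program.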

