Skimmer’s Guide for Week 7 of Functional Programming for the Object Oriented Programmer

20 May

“Pat yourself on the back! You have both used and implemented the built-in function reduce, perhaps the most dreaded of all sequence functions.”

What did we read about?

It’s our seventh week of ‘Functional Programming for the Object-Oriented Programmer’ by Brian Marick.

We come to the end of “Embedding an Object-Oriented Language” with some exercises and a bit of wrap-up.

  • Begin on Page 73 at section 6.4 Exercises

    • Exercise 1 (Page 73)
    • Exercise 2 (Page 74)
    • Exercise 3
    • Exercise 4 (Page 75)
    • Exercise 5
    • Exercise 6
  • Finish at Page 77 at the end of chapter 6


What stood out?

  • Anybody playing the functional tutorial drinking game can take two drinks: one for recursion and the other for factorial ;-)
  • The answer for each exercise feeds into the next, building up the complexity to use increasingly sophisticated forms of recursion.
  • The exercises conclude by introducing the powerful reduce function.
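
To give a flavour of where the exercises end up, here is a minimal sketch in plain Clojure. It is not the book's solution to any exercise, and the names factorial and my-reduce are made up for this illustration.

```clojure
;; Plain recursion: factorial, the drinking-game favourite.
(defn factorial [n]
  (if (<= n 1)
    1
    (* n (factorial (dec n)))))

;; A hand-rolled reduce: fold a function over a sequence,
;; carrying the running result along in an accumulator.
(defn my-reduce [f acc coll]
  (if (empty? coll)
    acc
    (recur f (f acc (first coll)) (rest coll))))

(factorial 5)               ; => 120
(my-reduce + 0 [1 2 3 4])   ; => 10
(reduce * 1 [1 2 3 4 5])    ; => 120, factorial again via the built-in
```

Once you have written something like my-reduce by hand, the built-in reduce stops being dreaded: it is the same accumulator pattern with the bookkeeping done for you.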

If you read nothing else this week…

  • Whatever you do, make sure you complete exercises 1 to 6.
  • Exercise 5 is particularly tricky, kudos if you complete it without hints! (Note that the result doesn’t have to be in order.)

Becoming Data Scientists – May’s Packt Publishing Competition

14 May

If you want to have a chance of winning one of this month’s books then please sign up on the Meetup page.

At the end of May the lucky winner will get a physical copy with an ebook for the runner up.

Data Science

Why Your Next HR Hire Should Be a Data Scientist @ Blogging4Jobs.com


The role of Data Scientist is new and important.  Big Data is seen as the key area for innovation, and the Data Scientist is a key role in putting Big Data to work.

So who are the Data Scientists, what do they do and does it include skills that we as developers may want to acquire?

The Data Scientist mines data for useful insights.  It’s a role that is closely related to the Computer Scientist, the role often fulfilled by developers like us.

Last month we looked at the tools used by the Computer Scientist; this month we look at the skills and tools needed by the Data Scientist.

The Data Scientist Role

In his book “Data Visualisation – a Successful Design Process” Andy Kirk identifies the “Eight Hats” of data visualisation design:

  • The Initiator – The leader who seeks a solution.
  • The Data Scientist – The data miner, wearing a miner’s hat, discovering nuggets of insight buried deep within the numbers.
  • The Journalist – The storyteller who refines the insight with narrative and context.
  • The Computer Scientist – The person who breathes life into the project with their breadth of software and programming literacy.
  • The Designer – With an eye for visual detail and a flair for innovation they work with the computer scientist to ensure harmony between form and function.
  • The Cognitive Scientist – Brings an understanding of visual perception, colour theories and human-computer interaction to inform the design process.
  • The Communicator – The negotiator and presenter who acts as the client-customer-designer gateway.
  • The Project Manager – The co-ordinator who picks up the unpopular duties and makes sure that the project is cohesive, on time and on message.

These are hats, and we will probably find ourselves wearing several of them over time. As you can see, Data Visualisation requires us to pull together a range of disciplines in order to achieve something meaningful.

Last month we focused on the skills of the Computer Scientist, looking at the skills needed to pull the data out of the repository and put it in front of the audience.

Miner Willy

Data Scientists are Data Miners

This month we are looking at the skills of the Data Scientist. Here’s Kirk’s full description:

The data scientist is characterized as the data miner, wearing the miner’s hat. They are responsible for sourcing, acquiring, handling, and preparing the data. This means demonstrating the technical skills to work with data sets large and small and of many different types. Once acquired, the data scientist is responsible for examining and preparing the data. In this proposed skill set model, it is the data scientist who will hold the key statistical and mathematical knowledge and they will apply this to undertake exploratory visual analysis to learn about the patterns, relationships, and descriptive properties of the data.

From Chapter 2 of Data Visualisation – A Successful Design Process

Last month we talked about data being the new soil.  The data scientist is a miner who digs down deep.  It is a pivotal role in the design process. Kirk elaborates further:

If we don’t have the data we want, or the data we do have doesn’t tell us what we hoped it would, or the findings we unearth aren’t as interesting as we wish them to be there is nothing we can (legitimately) do about it. That is an important factor to remember. No amount of 3D-snazzy-cool-fancy-design dust sprinkled on to a project can change that.

An incomplete, error strewn or just plain dull dataset will simply contaminate your visualization with the same properties. So, the primary duty for us now is to avoid this happening, remove all guessing and hoping, and just get on with the task of acquiring our data and immerse ourselves into it to learn about its condition, its characteristics, and the potential stories it contains.

From Chapter 3 of Data Visualisation – A Successful Design Process

This month we’re going to look at some of the tools we can use as Data Scientists to immerse ourselves in the data: tools that will help us to interact with our data, drill down into its seams and discover what nuggets lie within.

If you want to have a chance of winning one of this month’s books then please sign up on the Meetup page. At the end of May the lucky winner will get to choose a physical copy and the runner up can select an ebook.

Data Visualisation

Data Visualisation: a successful design process

For a second month we are going to look at Andy Kirk’s “Data Visualisation – a Successful Design Process.” It’s a great introduction to using Data Visualisation in your applications and the key text behind this series of competitions.

Kirk provides us with a structured approach to what can appear to be a dark art. The task of data familiarisation, for example, is organised into the following steps:

  1. Acquisition – Getting hold of the data.
  2. Examination – Assessing the data’s completeness and fitness.
  3. Data Types – Understanding the properties of the raw material. (Not to be confused with Data Types in our code.)
  4. Transforming for Quality – Tidying and cleaning, filling in the gaps.
  5. Transforming for Analysis – Preparing and refining for final use.
  6. Consolidating – Bringing it all together, mashing it up with other sources.

From Chapter 3 of Data Visualisation – A Successful Design Process
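
For developers, a few of these steps can be sketched in a handful of lines. This is a rough illustration in plain Clojure, not code from Kirk's book: the dataset, the function names and the decision to fill gaps with 0 are all assumptions made for the example.

```clojure
(require '[clojure.string :as str])

;; Acquisition: a tiny, made-up dataset standing in for a real source.
(def raw-rows
  [{:city " London " :visits "120"}
   {:city "Leeds"    :visits ""}
   {:city "london"   :visits "80"}])

;; Examination: how many rows are missing a value?
(defn missing-visits [rows]
  (count (filter #(str/blank? (:visits %)) rows)))

;; Transforming for quality: trim and normalise names, parse numbers,
;; and fill gaps with 0 (an arbitrary choice for this sketch).
(defn clean-row [{:keys [city visits]}]
  {:city   (str/capitalize (str/trim city))
   :visits (if (str/blank? visits) 0 (Long/parseLong visits))})

;; Transforming for analysis: total visits per city.
(defn visits-by-city [rows]
  (reduce (fn [totals {:keys [city visits]}]
            (update totals city (fnil + 0) visits))
          {}
          (map clean-row rows)))

(missing-visits raw-rows)   ; => 1
(visits-by-city raw-rows)   ; => {"London" 200, "Leeds" 0}
```

Whether a gap should become 0, be dropped or be flagged is exactly the kind of judgement the examination step is there to inform.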

Learning Highcharts for Javascript Data Visualisation

Highcharts allows the creation of sophisticated, interactive visualisations.

Let’s start at the ending: the consolidation of data to create an effective visualisation, like the one above.

Last month we looked at using HTML5 directly for producing our data visualisations. This month we’re going to look at Highcharts, a JavaScript library built on top of HTML5 to provide stunning interactive charts with a lot less effort. It’s free for non-commercial use.

Highcharts provides interactivity, allowing the users to become data scientists and explore the data for themselves.

This is an easy approach for simple data mining needs, but the real value is in mining the rich seams of complex data through Data Analysis.  For this some powerful tools are needed.

Data Analysis

Data Analysis Cookbook

Chapter 5 talks about distributed processing with Hadoop

Anybody looking for a ‘real world’ use of Clojure should take a look at the Incanter libraries and the practical value they provide in the first phases of Data Science.  This is a Clojure cookbook that is full of solid, practical recipes for dealing with large datasets.  It shows you how to go beyond spreadsheets to deal with data on new scales of size and complexity.

The book is particularly strong on recipes for acquisition and transformation for quality and analysis.  The first chapter will show you how to pull in your data from a whole range of data sources, including JSON, XML, CSV, JDBC and Excel.  The second chapter will show you how to clean up your data with tools like regular expressions, synonym maps, custom data type parsers and the Valip validation library.

Eric Rochester’s Cookbook provides sound, practical recipes.  If you want a practical introduction to Data Analysis that will get you up, running and productive quickly then this is the place to start.

It also touches on a whole range of other related topics, such as parallel programming, distributed processing and machine learning.

It isn’t so strong on the theoretical side of data analysis.  There’s a whistlestop tour of linear and non-linear relationships, Bayesian distributions and Benford’s law in chapter 7.  Chapter 9 introduces Weka for machine learning.  In between, chapter 8 shows you how to interface with Mathematica or R.

Statistical Analysis with R – Beginners Guide

A grouping of several plots displayed in the graphic window

If you want to learn more about the theory of data analysis you may want to consider working with R directly. R is the lingua franca of statistics and learning it will give you access to a wealth of resources available on the web.

In this beginner’s guide the author, John M. Quick, will show you how to get up and running with R.  The material is more abstract, with talk of standard deviations, linear models and ANOVA.  However, the author makes it more entertaining with a bit of role play.  You are the lead strategist for a kingdom who must gather your intelligence, prepare the battle plans and brief the emperor and his generals.

When it comes to learning the mathematical theory the book doesn’t go much deeper than the Data Analysis Cookbook.  However, it does present the information in an entertaining way and by learning R you open the door to working directly with a tool used by mathematicians rather than programmers.

Big Data

Statistical analysis has been around for a long time, but it is now being performed with more data than ever before.  Companies like Google and Facebook are now working with data on an unprecedented scale and that is why there is so much buzz about Big Data.

If you want to work with Big Data, processing massive data sets measured in the terabytes, then the essential tool to learn is MapReduce.
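
The idea behind the name can be sketched in a few lines of Clojure. This is an in-memory toy, not Hadoop code: the sample lines and the function names count-words and merge-counts are assumptions for the illustration. A real cluster would run the map phase on many machines and merge the partial results afterwards.

```clojure
(require '[clojure.string :as str])

;; Map phase: turn one line of text into a map of word -> count.
(defn count-words [line]
  (frequencies (str/split (str/lower-case line) #"\s+")))

;; Reduce phase: merge two partial results by adding their counts.
(defn merge-counts [a b]
  (merge-with + a b))

(def lines ["big data big insight"
            "data is the new soil"])

(reduce merge-counts {} (map count-words lines))
;; => {"big" 2, "data" 2, "insight" 1, "is" 1, "the" 1, "new" 1, "soil" 1}
```

Because each line is counted independently and merging is just addition, the work can be split up in any order, which is what makes the approach scale.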

MapReduce Cookbook

 

More advanced MapReduce scenarios are described.

The authors are well qualified.  Srinath Perera is a Senior Software Architect at WSO2 and has a Ph.D.  Thilina Gunarathne is a Ph.D. candidate at the School of Informatics and Computing of Indiana University.  They have provided 90 recipes, presented in a simple and straightforward manner, with step-by-step instructions and real world examples.  These recipes guide you through the complex business of getting Hadoop up and running and then not only demonstrate what MapReduce is but how it can be applied to problems such as analytics, indexing, searching, classification and text processing on a massive scale.  Along the way you will be exposed to the tools and techniques that are fundamental to working with big data.

Infinispan Data Grid Platform

While Hadoop is implemented in Java and offers a Java API, it doesn’t really sit within the Java ecosystem.  Using Hadoop requires learning a whole new ecosystem.  To use it properly you’ll need to get to know complementary Apache projects such as HBase, Hive and Pig.

If you want to be able to experiment with MapReduce and distributed computing while staying firmly within the Java ecosystem then consider the Infinispan data grid platform.  Installing Infinispan is as easy as installing JBoss AS7 and you can use it to provide persistence for your standard CDI applications without alteration.  The authors are Java people.  Francesco Marchioni has written several books on the JBoss application server and Manik Surtani is the specification lead of JSR 347 (Data Grids for the Java Platform).

The book offers practical guidance to get you up and running with the Infinispan platform.  While it does not deal with MapReduce, it will leave you well equipped to follow the online documentation.

Skimmer’s Guide for Week 6 of Functional Programming for the Object Oriented Programmer

13 May

“I picked the rather odd name method-cache to give me an excuse to point out that this implementation isn’t completely silly.”

What did we read about?

This is our sixth week reading Functional Programming for the Object-Oriented Programmer.

  • Begin on page 56 at the start of the chapter and read until page 73, leaving the exercises for next week.

This week we finally start learning about Recursion and how it can be used to implement Inheritance in our object system. We touch on using assert for pre-conditions and define a superclass “Anything”. In a rather round-about fashion, we look at how we can do method overriding.

Although rather slow and slightly patronising at times, the book covers different patterns for recursion and visits the problem of tail recursion.
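
The tail recursion problem is easiest to see side by side. The sketch below is not taken from the book, and the names sum-naive and sum-tail are made up for the illustration. Clojure does not optimise tail calls automatically, so the second version uses recur to ask for it explicitly.

```clojure
;; Naive recursion: each call waits for the one below it,
;; so a very long collection can overflow the stack.
(defn sum-naive [coll]
  (if (empty? coll)
    0
    (+ (first coll) (sum-naive (rest coll)))))

;; Tail recursion with an accumulator: the recursive call is the
;; last thing the function does, so recur can reuse the stack frame.
(defn sum-tail
  ([coll] (sum-tail 0 coll))
  ([acc coll]
   (if (empty? coll)
     acc
     (recur (+ acc (first coll)) (rest coll)))))

(sum-naive [1 2 3 4])          ; => 10
(sum-tail (range 1 100001))    ; => 5000050000
```

The accumulator is the trick: the pending work is carried forward as an argument rather than left waiting on the stack.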

What stood out?

  • The use of a method-cache to do method overriding is very silly but somewhat interesting.

If you read nothing else this week…

  • Try to get to grips with all the types of recursion; this will stand you in good stead for the exercises next week.

Skimmer’s Guide for Week 5 of Functional Programming for the Object Oriented Programmer

5 May

What did we read about?

This is our fifth week reading Functional Programming for the Object-Oriented Programmer.

  • Begin on page 54, doing all five exercises from the last chapter “Moving the Class Out of the Constructor”.

Very straightforward this week: we are using all the knowledge we have been building up to complete these fairly simple exercises. A good chance for people to catch up as well!

What stood out?

  • The first exercise is excellent for practicing refactoring Clojure functions and making them more manageable through helper functions.

If you read nothing else this week…

  • Exercise 4 introduces a neat trick which isn’t intuitive, kudos if you complete it without hints!

Presenting For Geeks Give Away

3 May

“A presentation is not about the content or about you – it’s about the audience.”

We are continuing our partnership with developer.press, a new kind of book publisher that is changing the way books for software developers are produced.

They specialise in digital books, written by leading experts from across the software development ecosystem. The short format books they produce give you access to key technical know-how for less than the price of a cup of coffee!

This time around we are giving away the wonderful and concise “Presenting for Geeks” by Dirk Haun (winners will be able to request the e-format). Reassuringly well presented, he takes us on a journey on how to convey a memorable message.

Presenting for Geeks by Dirk Haun

TO WIN: Simply jump on our Google+ community and post a comment in the corresponding post! (e.g. “I want to win Presenting for Geeks”)

2 lucky winners will then be picked on Friday 17th May 2013!

We really like this book (catch the Skimmer’s Guide here) and are excited about future give-aways with developer.press! Watch this space and check out more from this fantastic new publisher on Facebook, Google+, Twitter and their blog: http://developerpressebooks.wordpress.com/

Skimmer’s Guide for Week 4 of Functional Programming for the Object Oriented Programmer

1 May

“This reality is usefully obscured by the language so that programmers can, without thinking, do wonderful things, blissfully pretending that the pictures in their head are what the computer is really doing.”

What did we read about?

This is our fourth week reading Functional Programming for the Object-Oriented Programmer.

  • Begin on page 47 (“All the Class in a Constructor”) and read until page 54, leaving the exercises for next week.

The barely believable object from week three is developed further by moving the Class out of the constructor and introducing object instantiation and message dispatch.

What stood out?

  • The Let special form was introduced.

If you read nothing else this week…

  • Make sure you read about the Let special form.
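
If you have not met it yet, let binds names to values for the duration of its body. A minimal sketch (the hypotenuse function is made up for this example, not taken from the book):

```clojure
;; let introduces local, immutable names that exist only inside its body.
(defn hypotenuse [a b]
  (let [a-squared (* a a)
        b-squared (* b b)]
    (Math/sqrt (+ a-squared b-squared))))

(hypotenuse 3 4)   ; => 5.0
```

It does the job of local variables in Java, except that the names are never reassigned.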

Skimmer’s Guide for Week 3 of Functional Programming for the Object-Oriented Programmer

25 Apr

“Given a large enough dose of magic mushrooms, we can hallucinate that the class-of, x and y callables are instance methods of Point”

What did we read about?

This is our third week reading Functional Programming for the Object-Oriented Programmer.

  • Begin at Part I, “Embedding an Object-Oriented Language”, and read until p. 46 (“All the Class in a Constructor”)

We delve into writing a pseudo object-oriented language within Clojure. Firstly we cover the semantics and definitions of everything we will be dealing with (i.e. objects, instances). Whilst a bit verbose, it helps to establish common ground from which we can all work.

With all that out of the way, we dive into implementing a basic geometry. We gradually build up Point objects, adding methods until we have reached Triangles!

As always, the exercises at the end serve as excellent tests of your understanding on the topic. Although the material covered may not be vast, it is important to practice programming functionally to make the next part of the book easier to grasp.

What stood out?

  • The introduction explains this chapter well as a section focussing on giving us the “opportunity to practice programming in a functional language”. This experience will “make the topics in the next part of the book easier to grasp”.
  • Exercises 3 and 4 are actually fairly involved and challenge you to think in the functional way.

If you read nothing else this week…

  • Jump straight to “A Barely Believable Object” to get into the implementation details!

Skimmer’s Guide for Week 2 of Functional Programming for the Object-Oriented Programmer

14 Apr

“How do you write loops in Clojure? You don’t (mostly).”

What did we read about?

This is our second week reading Functional Programming for the Object-Oriented Programmer.

  • Finish Chapter 1 – from 1.11 Vectors

After discussing vectors and how they differ from lists we finally discuss what we might consider the basic elements of a programming language: loops, conditionals and different methods for passing parameters.  The chapter concludes with four pages of exercises.

In a language like Java we are usually introduced to the control flow first and the data structures come later.  Here the reverse approach is taken: with a discussion on data structures coming first and the control flow following.  The chapter is then disparaging about both conditionals and loops.

What stood out?

  • In structured programming there are three basic constructs: sequence, selection and repetition.  In functional programming sequence is achieved using lists.  Conditionals are introduced with a reference to the Anti-IF Campaign and regarding repetition we are told that we don’t write loops (mostly).  This book is forcing us to think about programming in a new way.
  • Rather than walking through the common functions the reader is given an exercise (5) with a list and told to think of a problem it could solve and then solve it.  The reader is not being spoon fed.
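
To make the "no loops (mostly)" point concrete, here is a small sketch using standard Clojure sequence functions. The numbers vector is made up for the example; nothing here comes from the book's exercises.

```clojure
;; No loop counters: describe the transformation instead.
(def numbers [1 2 3 4 5 6])

(map inc numbers)                                  ; => (2 3 4 5 6 7)
(filter even? numbers)                             ; => (2 4 6)
(reduce + (map #(* % %) (filter even? numbers)))   ; => 56
```

In Java each of these would traditionally be a for loop with a temporary collection or a running total; here the repetition is hidden inside map, filter and reduce.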

If you read nothing else this week…

  • Work through the exercises in section 1.18.

Further research

  • If you’re new to Clojure then I would strongly recommend Clojure Made Simple.  It will provide you with many essential tips that will help you stay sane.  In section 2.8, for example, you’ll find out how to access the built-in docs referred to in exercise 5.  Type “(doc take)” at the REPL and you’ll quickly discover that it “returns a lazy sequence of the first n items in coll”.  That will definitely help you keep your sanity.
  • Take a look at the Anti-IF Campaign.
  • The classic book Thinking Forth has an excellent chapter on “Minimizing Control Structures:”  “The use of control structures adds complexity to your code.  The more complex your code is the harder it will be for you to read and maintain. The more parts a machine has, the greater are its chances of breaking down.  And the harder it is for someone to fix.”  The principles explained are good for any language.
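
As a quick taste of what that docstring for take means in practice, here is a sketch you can paste into a REPL. The sample values are made up; take, iterate, filter and range are all standard clojure.core functions.

```clojure
;; take returns the first n items of a collection.
(take 3 [10 20 30 40 50])        ; => (10 20 30)

;; Because the result is lazy, take can safely pull from
;; sequences that never end.
(take 5 (iterate inc 1))         ; => (1 2 3 4 5)
(take 3 (filter odd? (range)))   ; => (1 3 5)
```

Only as much of the infinite sequence is computed as take actually asks for.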

Beautiful Data – April’s Packt Publishing Competition

9 Apr

If you want to have a chance of winning one of this month’s books then please sign up on the Meetup page.

At the end of April the lucky winner will get a physical copy with an ebook for the runner up.

Data is the New Soil

Nightingale’s Rose

It has been said that “Data is the New Oil.”  In his excellent TED Talk David McCandless tells us that “Data is the New Soil”:

It feels like we’re all suffering from information overload, or data glut. And the good news is there might be an easy solution to that, and that’s using our eyes more. So visualizing information, so that we can see the patterns and connections that matter, and then designing that information so it makes more sense, or it tells a story, or allows us to focus only on the information that’s important…. Visualizing information like this is a form of knowledge compression. It’s a way of squeezing an enormous amount of information and understanding into a small space.

http://www.cjr.org/the_news_frontier/data_is_the_new_soil.php?page=all

There’s a lot of value in creating something that takes the information overload and transforms it into a form that makes the story it tells clear.

An inspiring example of this was Florence Nightingale’s Rose Diagram, pictured above.  The lady was able to analyse and present data.  The lady of the lamp’s use of data visualisation saved countless lives:

After the war, Nightingale wrote a passionate report on why the soldiers had died in such large numbers and it revealed the astonishing fact that out of 18,000 deaths, 16,000 had been due to infectious diseases in hospital rather than battle wounds. The report included her revolutionary and controversial ‘Rose Diagram’, whose message was potent and direct – hospitals can kill. The diagram was designed to persuade the British government that, if sanitation in hospitals was improved, many deaths could be avoided. Nightingale’s pioneering diagram was a catalyst in the creation of better and cleaner hospitals that would go on to save thousands of lives.

http://www.bbc.co.uk/programmes/b00wgqlq

How can we as developers use data visualisation to make a difference?  This month we shall look at books that show us how to design, display, store, access and secure data.  With these tools in hand you are ready to inspire with the beauty of data.

If you want to have a chance of winning one of this month’s books then please sign up on the Meetup page.  At the end of April the lucky winner will get a physical copy with an ebook for the runner up.

Visualising Data

Data Visualization: a successful design process

Andy Kirk is the author of the Visualising Data blog.  There you will find him talking about powerful visualisations like this:

Iraq’s bloody toll

His book offers a handy strategy guide to help you approach your data visualization work with greater know-how and increased confidence. It is a practical book structured around a proven methodology that will equip you with the knowledge, skills, and resources required to make sense of data, to find stories, and to tell stories from your data.

HTML5 Graphing and Data Visualization Cookbook

We are developers not graphic artists, so when it comes to telling the story from our data we won’t be reaching for our pencils.  Instead we will look to the graphs and charts provided by our user interface framework.  The widest possible audience is available using the standard components provided by HTML5.

Ben Fhala has developed applications for governments and companies and directed many award-winning projects.  He has worked on teams that have won three Agency of the Year awards.  In this cookbook he shares recipes for bringing static data to life.

Data Highs

Data visualisation applications are hungry for fast and reliable storage.  Here are two possible approaches: with and without SQL.

High Availability MySQL Cookbook

Alex Davies covers all the major techniques available for achieving high availability for MySQL, including clustering, replication, shared storage and block level replication.

Cassandra High Performance Cookbook 

Traffic Monitoring benefits from data visualisation

Edward Capriolo‘s recipes include how to access data stored in Cassandra and use third party tools to help you out. He also describes how to maintain high levels of performance through monitoring and capacity planning.

Data through the Middle

You have data in your repository feeding graphics in your user interface.  Sitting in-between is the middleware that makes the data available to the clients that need it while keeping out those who shouldn’t see it.  Here are two books that show how Spring can make working with that middleware easier.

Spring Data


Petri Kainulainen’s book shows how JPA repositories can be implemented with less code. Sample projects demonstrate the concepts in action.

Spring Security

User Management

Robert Winch and Peter Mularien use a simple Spring Web MVC-based application to illustrate how to solve real-world problems.

Skimmer’s Guide for Week 1 of Functional Programming for the Object-Oriented Programmer

7 Apr

“hold off on thinking that programming without an assignment statement has to be crazy hard–it’s part of this book’s job to show you it’s not.”

What did we read about?

This week we started our second book, Functional Programming for the Object-Oriented Programmer.

  • Chapter 1 – Up to 1.10 Lists

After the rapid pace of our last book, we are taking things slow and steady as we get to grips with Clojure and functional programming. As you would expect, the opening chapter of the book introduces Clojure and the options you have for setting it up on your machine (IDE or command line tools like Leiningen).

Following this the author covers all the basics, including defining functions and handling the list data structure. The exercises at the end are very basic but help to ensure you have a good grip of everything that was covered so far.

The Lazy REPL Bird

In stark contrast to the approach taken in the “Well Grounded” chapter, this short introduction focuses more on how the REPL is implemented than on how it is used.  The evaluator is anthropomorphised as a lazy little bird eating up parentheses, tokens and symbols.  There are many pictures but it doesn’t seem to cover a lot of ground.  The exercises seem rather simplistic, such as returning the second and third elements in a list.  The introduction focuses on how the functional code’s execution style differs from the object-oriented approach the reader is used to.

What stood out?

  • The Clojure REPL (Read-Eval-Print Loop) walkthrough, including an easy explanation of how it actually works.
  • Thinking of functions as ‘values’ and embracing pure immutability.
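
Both ideas fit in a few lines at the REPL. This is a minimal sketch rather than an example from the book; the names add-five, twice, original and extended are made up for the illustration.

```clojure
;; Functions are values: they can be named, passed around and returned.
(def add-five (partial + 5))
(defn twice [f x] (f (f x)))

(twice add-five 1)   ; => 11

;; Data is immutable: conj returns a new vector and
;; leaves the original untouched.
(def original [1 2 3])
(def extended (conj original 4))

original   ; => [1 2 3]
extended   ; => [1 2 3 4]
```

Nothing was assigned to after its definition, which is the habit the opening quote asks us to give a fair hearing.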

If you read nothing else this week…

  • Check out Light Table which has an excellent ‘Instarepl’ feature which is great for beginners and experts alike (neat video demo here)
  • If you already know the functional basics you can skim most of this week’s content.  It’s just a simple introduction to the REPL.
  • If you’re new to functional languages then you need to work through this text carefully.  The ideas being introduced are simple but fundamental to understanding the rest of the book.

Further research

  • If you need to take more time learning the Clojure language then check out Clojure Made Simple.  It was written by our very own John Stevenson.  He knows a thing or two about real world Clojure.  John’s approach to introducing Clojure is far more practical than this week’s chapter.
  • Rich Hickey, the creator of Clojure, has an excellent presentation on InfoQ called “Simple Made Easy.”  In it he talks about the design philosophy behind the language.