Thursday, November 24, 2016

Converting webpage to epub file and transfer to Kobo E-reader easily

I like;
- Reading Wikipedia pages, especially regarding the history of nations.
- Reading from my e-reader.
- Reading at times when my laptop is shutdown, f.e. in bed.
- Being able to switch topic spontaneously.

I tried to find a solution for this.

First I tried the web browser on my Kobo e-reader (in beta). Unfortunately, this is very annoying, for a number of reasons. Later I found the website https://dotepub.com/ which allows you to convert a webpage to an epub file online. Unfortunately, this does not work on the e-reader web browser.

For this reason, I figured I will use my iPhone to convert a web page to an epub file. The next challenge would be to transfer the epub file from my Iphone to my e-reader. I tried using google drive, but this was messy in the e-reader web browser. So I need to use SD media. Since the Iphone has no SD slot, wireless transfer is needed. Therefore I bought the Verbatin Mediashare Wireless, which has an SD slot and Wifi capabilities, it also has an iPhone app. Unfortunately, it's impossible to transfer non photo or video media from the Iphone to the Verbatin Mediashare using the app! What a disappointment! I could have known since there a lot of negative reviews on the web with people complaining. But there is good news!

By coincidence, I learned that the Verbabin Mediashare hosts a website which has the option to upload files. When using this upload option you are able to use regular file apps.

In the screens below I show how to use this setup. Don't hesitate to give feedback in the comments section when you have a better solution.

First nagivate to the webpage, with safari(!), that you would like to convert to epub. Then open the bookmark with the dotepub.com scripts (see webpage for instructions).



select more and choose your file management solution ( I am using Documents 5). 


Store it in your folder: 


Connect to the Verbatin Mediashare using Wifi. 

In your browser navigate to 10.10.10.254


press enter. 

and add the file, select the second icon. 



as source choose your file management app, in my case 'Documents 5'.


select the file, in this case 'history of denmark'

select ok.


The file will be stored on the SD card. Shut down the Verbatin Mediashare. And insert the SD card in your e-reader. The e-reader will recognize the new file and you can start reading Wikipedia from your e-reader!













Thursday, July 7, 2016

Raspberry Pi Weather Info

Hi,

In this post I will show how to create a weather info station within an hour. The end result looks like this:


The used components:

- A Raspberry Pi (obviously).

- Very handy casing (8, 95 euro).

- A Keyboard (general application of course), (24,95 euro)

- A TFT screen (12,95 euro)

Installing
I got the TFT screen working by performing the following steps: http://www.circuitbasics.com/setup-lcd-touchscreen-raspberry-pi/


Upload a html file with the following contents:






Upload the file to: /var/www/

Make sure to give it execute permissions!

Then boot the Pi, run the LX terminal and type in (using the keyboard):
  epiphany-browser localhost/weer.html (or whatever name you gave the html file).

btw; I still need to find a way to turn of the screen saver..

Thursday, June 23, 2016

BIML Series, Part 1, BIML Development approach

This summer I will post quite a bit about BIML. In this post I will share how one can develop BIML packages without purchasing additional software. I installed Microsoft Visual C# 2008 express edition, because it hardly occupies any disk space and is a free download. Within Visual C# is run this code:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Collections.Generic;
using System.ComponentModel;
using System.Text;
using System.Diagnostics;
using System.Windows.Forms;


public class ExtractTable
{
public string PackageName { get; set; }
public string SequenceName { get; set; }
public string DataFlowName { get; set; }
public string SchemaName { get; set; }
public string SourceSystemName { get; set; }
public string SelectQuery { get; set; }
public string SQLFlavour { get; set; }
public string ExtractTableName { get; set; }
public string SourceTableName { get; set; }
public string ArchiveProcedure { get; set; }


private static string metadataConnectionString = "Server=localhost;Database=meta;User Id=lalal; Password=dada;";


private static DataTable getData(string query)
{
DataTable dataTable = new DataTable();

using (SqlDataAdapter dataAdapter = new SqlDataAdapter(query, metadataConnectionString))
{
dataAdapter.Fill(dataTable);
}

return dataTable;
}

private static DataTable getData(string query, string parameter, string parameterValue)
{
DataTable dataTable = new DataTable();

using (SqlConnection conn = new SqlConnection(metadataConnectionString))
{
using (SqlCommand cmd = new SqlCommand(query, conn))
{
cmd.Parameters.AddWithValue(parameter, parameterValue);
SqlDataAdapter adapter = new SqlDataAdapter(cmd);

adapter.Fill(dataTable);
}
}

return dataTable;
}

private static ExtractTable fillExtractTable
(
string PackageName,
string SequenceName,
string DataFlowName,
string SchemaName,
string SourceSystemName,
string SelectQuery,
string SQLFlavour,
string ExtractTableName,
string SourceTableName,
string ArchiveProcedure
)
{
ExtractTable extractTable = new ExtractTable();

extractTable.PackageName = PackageName;
extractTable.SequenceName = SequenceName;
extractTable.DataFlowName = DataFlowName;
extractTable.SchemaName = SchemaName;
extractTable.SourceSystemName = SourceSystemName;
extractTable.SelectQuery = SelectQuery;
extractTable.SQLFlavour = SQLFlavour;
extractTable.ExtractTableName = ExtractTableName;
extractTable.SourceTableName = SourceTableName;
extractTable.ArchiveProcedure = ArchiveProcedure;

return extractTable;
}
[STAThread]
static void Main()
{
// Package objects
var extractTables = new List();

// Variables used to retrieve and write data
string query;
DataTable tblExtractTables;

// Add ExtractTables
query = "SELECT top 1 "
+ " 'Extract_Entrino_' + bet.SourceTableName AS PackageName,"
+ " bet.SequenceName,"
+ " bet.DataFlowName,"
+ " bet.SchemaName,"
+ " bet.SourceSystemName,"
+ " bet.SelectQuery,"
+ " bet.SQLFlavour,"
+ " bet.ExtractTableName,"
+ " bet.SourceTableName,"
+ " bet.ArchiveProcedure"
+ " FROM META.BimlExtractArchivePackages AS bet ";


tblExtractTables = getData(query);

foreach (DataRow row in tblExtractTables.Rows)
{
extractTables.Add(fillExtractTable(row[0].ToString(), row[1].ToString(), row[2].ToString(), row[3].ToString(), row[4].ToString(), row[5].ToString(), row[6].ToString(), row[7].ToString(), row[8].ToString(), row[9].ToString()));
}



foreach (ExtractTable extractTable in extractTables)
{
string text = System.IO.File.ReadAllText(@"C:\TFS\Temp\biml.txt");
text = text.Replace("<#=extractTable.PackageName#>", extractTable.PackageName);
text = text.Replace("<#= extractTable.SelectQuery#>", extractTable.SelectQuery);
Clipboard.SetText(text);


}
}

}

The most important thing to note about this code is the fact a predefined piece of biml code is loaded from a text file and the variables are replaced by the values coming from the meta data repository in SQL. The resulting non dynamic biml code can then be pasted in VS2008 or later with BIDS helper to generate the package. Any error messages will be a lot easier to debug. Will be continued..

Saturday, June 11, 2016

First experience with Neo4J

During the DWH and BI summit, I got inspired to look into Graph Databases. A former colleague recommended looking into Neo4J. It turned out to be quite easy to load data in Neo4J and more easily explore related data. In our case, we wanted to analyze our Meta Data Driven DWH Framework. Which had grown quite a bit and became more and more difficult to maintain. We gave the 'Force directed Graph' visualization in Power BI a try, but this was to limited for our analysis goals.

The video below shows the results.

More in Neo4J in the future!

Remote controlled car with Raspberry Pi (piborg) and HC-SR04 ultrasonic sensor (to detect walls and stop car)

I this post I will show how to build a remote controlled car with your Raspberry Pi that will stop whenever it encounters a wall at a predefined distance. The result will look like this:

The car looks like this in detail:

Components


Smart buggy 16 euro.

Two additional wheels, that I glued to the plastic board.
PIBORG - PICOBORG V2 - MOTOR CONTROLLER, 13 euro.

Triborg, 9 euro.

The ultra sonic distance sensor, HCSR04, 2,50 euro.

Wiring was done as described on the internet. The car is remote controlled via telnet using Putty.

Code

The code below is executed. This was tricky because the key listener needs to be non blocking. Once I found one that is non blocking. Its just a matter of using keys, I, J, K, L for steering the car. The distance sensor checks for distances less than 30 CM. The motor is turned off when this happens.

import sys
import select
import tty
import termios
import contextlib
import time
import RPi.GPIO as GPIO
GPIO.setmode(GPIO.BCM)


#used by the distance meter
TRIG = 23
ECHO = 24

GPIO.setup(TRIG,GPIO.OUT)
GPIO.setup(ECHO,GPIO.IN)

#used by the piborg
# Set which GPIO pins the drive outputs are connected to
DRIVE_1 = 4
DRIVE_2 = 18
DRIVE_3 = 8
DRIVE_4 = 7

# Set all of the drive pins as output pins
GPIO.setup(DRIVE_1, GPIO.OUT)
GPIO.setup(DRIVE_2, GPIO.OUT)
GPIO.setup(DRIVE_3, GPIO.OUT)
GPIO.setup(DRIVE_4, GPIO.OUT)

# Map current on/off state to command state
dInvert = {}
dInvert[True] = GPIO.LOW
dInvert[False] = GPIO.HIGH

# Map the on/off state to nicer names for display
dName = {}
dName[True] = 'ON '
dName[False] = 'OFF'

# Function to set all drives off
def MotorOff():
GPIO.output(DRIVE_1, GPIO.LOW)
GPIO.output(DRIVE_2, GPIO.LOW)
GPIO.output(DRIVE_3, GPIO.LOW)
GPIO.output(DRIVE_4, GPIO.LOW)

def MotorOn():
GPIO.output(DRIVE_1, GPIO.HIGH)
GPIO.output(DRIVE_2, GPIO.HIGH)
GPIO.output(DRIVE_3, GPIO.HIGH)
GPIO.output(DRIVE_4, GPIO.HIGH)


def afstand():
time.sleep(0.2)
GPIO.output(TRIG, False)
GPIO.output(TRIG, True)
time.sleep(0.00001)
GPIO.output(TRIG, False)
while GPIO.input(ECHO)==0:
pulse_start = time.time()

while GPIO.input(ECHO)==1:
pulse_end = time.time()

pulse_duration = pulse_end - pulse_start
distance = pulse_duration * 17150
distance = round(distance, 2)

return distance
#distance

def isData():
return select.select([sys.stdin], [], [], 0) == ([sys.stdin], [], [])


old_settings = termios.tcgetattr(sys.stdin)
try:
tty.setcbreak(sys.stdin.fileno())


while 1:
if (afstand() < 50):
MotorOff()


if isData():
c = sys.stdin.read(1)
print str((c))
if c == '\x1b': # x1b is ESC
break
elif c == 'i': #Vooruit
MotorOn()
elif c == 'k': #Stop
MotorOff()
elif c == 'j': #Links
print 'links'
if ((GPIO.input(DRIVE_1)) & (GPIO.input(DRIVE_4))):
print 'aan het rijden'
GPIO.output(DRIVE_1, GPIO.LOW)
GPIO.output(DRIVE_2, GPIO.LOW)

time.sleep(0.1)
GPIO.output(DRIVE_1, GPIO.HIGH)
GPIO.output(DRIVE_2, GPIO.HIGH)
else:
GPIO.output(DRIVE_3, dInvert[GPIO.input(DRIVE_3)])
GPIO.output(DRIVE_4, dInvert[GPIO.input(DRIVE_4)])
elif c == 'l': #Rechts
print 'rechts'
if ((GPIO.input(DRIVE_1)) & (GPIO.input(DRIVE_4))):
print 'aan het rijden'
GPIO.output(DRIVE_4, GPIO.LOW)
GPIO.output(DRIVE_3, GPIO.LOW)

time.sleep(0.1)
GPIO.output(DRIVE_4, GPIO.HIGH)
GPIO.output(DRIVE_3, GPIO.HIGH)
else:
GPIO.output(DRIVE_1, dInvert[GPIO.input(DRIVE_1)])
GPIO.output(DRIVE_2, dInvert[GPIO.input(DRIVE_2)])
finally:
print("stop")
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_settings)

Thursday, June 9, 2016

Market Basket Analysis (Association Rule Learning) with Power BI (DAX) and R

Introduction

In this post I will show how to run an R script from Power BI which will execute an Association rule learning script to perform market basket analysis.

In this example we will not look at products sold, but products sharing shelf space.

The dataset

Our basic dataset looks like this.

 Our products:

The distribution / presence of products on the shelf of a customer:

The Power BI building blocks

The data model

As for the DAX part we will start with this post of Marco Russo and Alberto Ferrari.

So the data model in Power BI looks like this:



The R visualization

We will look at the DAX part later on. First we add an R component with a script that will return the AR rules it found.


The table contains the basic output that is to be expected from AR. We will try to build these measures in DAX later on.

The R script

As for the R script it looks like this:


   
save(dataset, file="C:/TFS/dataset.rda")

library(arules, lib.loc="C:/TFS/Rlib/a/" , logical.return = FALSE,
warn.conflicts = F, quietly = T,verbose = F)
library(plotrix, lib.loc="C:/TFS/Rlib/p/" , logical.return = FALSE,
warn.conflicts = F, quietly = T,verbose = F)

dataset = cbind(dataset, 1)
colnames(dataset) = c("ProductID", "CustomerID", "Waarde")
reports = xtabs(Waarde~CustomerID+ProductID, data=dataset)
reports[is.na(reports)] <- 0
rules <- apriori(as.matrix(as.data.frame.matrix(reports)),parameter = list(supp = 0.03, conf = 0.5, target = "rules"))
t = inspect(head(sort(rules, by ="support"),15))

par(mar = c(0,0,0,0))
plot(c(0, 0), c(0, 0))
if (is.null(t)) {
t = data.frame("no rules found")
text(x = 0.5, y = 0.5, paste("No Rules found"),
cex = 1.6, col = "black")
} else {
addtable2plot(-1, -1, t, bty = "n", display.rownames = F, hlines = F,
vlines = F)
}

Unfortunately Power BI initializes a new R sessions each time the R visualization is run / cross filtered.  Therefore I tried to use a much base R as possible. As for the libraries that need to be loaded. I put these in a separate folder on my local drive and specified the folder name in the library command.

Building it in DAX

Support

The output of the arules R script can be built in DAX whenever it concerns single item combinations, so X -> Y. So not A, B -> Y.  The 'support' measure is basically the '[Orders with Both Products %]' described by Russo and Ferrari. Just to show how its implemented on our dataset.
  
Customers with Both Products % =
IF (
NOT ( [SameProductSelection] );
DIVIDE ( [Customers with Both Products]; [Unique Customers All] )
)
The building blocks of this formula:
Same product selection, since this is useless.
  
SameProductSelection =
IF (
HASONEVALUE ( Products[ID] )
&& HASONEVALUE ( 'Filter Products'[ID] );
IF (
VALUES ( Products[ID] )
= VALUES ( 'Filter Products'[ID] );
TRUE
)
)
Customers with both products:
   
Customers with Both Products =
CALCULATE (
DISTINCTCOUNT ( Distribution[Customer ID] );
CALCULATETABLE (
SUMMARIZE ( Distribution; Distribution[Customer ID] );
ALL ( Products );
USERELATIONSHIP ( Distribution[Product ID]; 'Filter Products'[ID] )
)
)
Number of customers in total:
Unique Customers All = 
CALCULATE (
DISTINCTCOUNT ( Distribution[Customer ID] );
ALL ( Products )
)

Confidence

   
Confidence = [Customers with Both Products] / [Unique Customers LHS]
Unique Customers LHS:
   
Unique Customers LHS = DISTINCTCOUNT(Distribution[Customer ID])

Lift



Lift = [Confidence] / [Proportion Product RHS]

Proportion product RHS:

Proportion Product RHS = Distribution[Unique Customers RHS] / [Unique Customers All]

Unique customer RHS:
 
Unique Customers RHS =
CALCULATE (
DISTINCTCOUNT ( Distribution[Customer ID] );
CALCULATETABLE (
SUMMARIZE ( Distribution; Distribution[Customer ID] );
ALL ( Products );
USERELATIONSHIP ( Distribution[Product ID]; 'Filter Products'[ID] )
); ALL(Products)
)

You can download the Power BI file here.

In this video you see the Power BI file in use:

Sunday, May 29, 2016

Internet of Things, my first "thing"

After watching this very nice episode of Tegenlicht about smart cities I got inspired to start sharing my experiences with my Raspberry Pi. I started playing with the Pi about 2 years ago and tried a most of the common sensors (temp, light, sound, movement, etc). The coming period I will start sharing my experiences. The is the Tegenlicht episode on smart cities, higly recommended!



By the end of this post we are able to view the temperature on a webpage which can be viewed with your computer or any mobile device of course.

The required hardware components:

- Raspberry Pi 2 or 3.. (35 euro)

- Wifi adapter
- SD card
- DHT11 Temparature sensor (2,45 euro)
- mini breadboard (1,55 euro)
- 10 Kohm resistor (0,10 euro


Regarding the hardware purchase; I like to buy the Pi at SOS solutions because this guy makes sure the peripherals are high quality. You can get the Pi for a slightly lower price elsewere but he makes up with good service 7 days a week!

I buy all the componts at vanallesenmeer.nl. This shop offers the products at the lowest price possible and delivery has always been on time.  

Software used:
- Putty for telnet session
- FileZilla to transfer the files with FTP
- Notepad++ to edit / write the code. 

Putting it together:

This shows how I wired the DHT11:


It looks like this in real life:


The software:
We install the ADA fruit DHT library for convenience as instructed.
  
sudo apt-get update
sudo apt-get install build-essential python-dev
sudo python setup.py install

Now we can create a file with the following contents, just like the example that has been provided:

#!/usr/bin/python

import Adafruit_DHT
sensor = Adafruit_DHT.DHT11
pin = 18
humidity, temperature = Adafruit_DHT.read_retry(sensor, pin)

if humidity is not None and temperature is not None:
print 'Temp={0:0.1f}*C Humidity={1:0.1f}%'.format(temperature, humidity)
else:
print 'Failed to get reading. Try again!'

When you place this file in the /var/www/ folder next to a php file with the following contents you will be able to disply the temperature on a webpage.

<html>
<?php
echo exec('sudo python ./simpletest.py')
?>
</html>

You need to make the www-data group owner of the file:

sudo chown www-data temp.php

When you open the page in a browser the result looks like this. Mind its hot in the closet where the Pi resides..

Friday, May 27, 2016

Correlation in DAX

In this blogpost I show how to calculate the correlation in DAX. This post will be refined in the future, also to show the comparison with R. The code is shown so you see how to run it in DAX studio. We will investigate the correlation between visits and sales. We define the standard deviation for visits:
    
DEFINE
MEASURE Visits[sdtotal_visits] =
CALCULATE (
STDEVX.P ( VALUES ( 'Visit Date'[Month] ), [Completed Visits] ),
ALLSELECTED ( 'Visit Date' )
)

We define the mean for visits:
    
MEASURE Visits[meantotal_visits] =
CALCULATE (
AVERAGEX ( VALUES ( 'Visit Date'[Month] ), [Completed Visits] ),
ALLSELECTED ( 'Visit Date' )
)
Same for value
    
MEASURE Visits[sdtotal_ov] =
CALCULATE (
STDEVX.P ( VALUES ( 'Visit Date'[Month] ), [Order Value] ),
ALLSELECTED ( 'Visit Date' )
)
MEASURE Visits[meantotal_ov] =
CALCULATE (
AVERAGEX ( VALUES ( 'Visit Date'[Month] ), [Order Value] ),
ALLSELECTED ( 'Visit Date' )
)
We multiply the two standard deviations:
    
MEASURE Visits[sdsd] =
CALCULATE ( [sdtotal_visits] * [sdtotal_ov] )
We calculate the deviation from the mean for each measure:
    
MEASURE Visits[afwijking_visits] =
CALCULATE ( ( [Completed Visits] - [meantotal_visits] ) )
MEASURE Visits[afwijking_value] =
CALCULATE ( ( [Order Value] - [meantotal_ov] ) )
We calculate the average of the product of the two deviations:
    
MEASURE Visits[avgproduct] =
CALCULATE (
AVERAGEX (
VALUES ( 'Visit Date'[Month] ),
[afwijking_visits] * [afwijking_value]
),
ALLSELECTED ( 'Visit Date' )
)
We devide this value of the product of the two SD's
    
MEASURE Visits[correlation2] =
CALCULATE ( [avgproduct] / [sdsd] )
Now the query to view the results:
    
EVALUATE
CALCULATETABLE (
ADDCOLUMNS (
SUMMARIZE ( 'Visit Date', 'Visit Date'[Month] ),
"Visits", [Completed Visits],
"order value", [Order Value],
"correlation2", [correlation2]
),
'Visit Date'[Month Key] = "2015M12"
|| 'Visit Date'[Month Key] = "2016M01"
|| 'Visit Date'[Month Key] = "2016M02"
)

Tuesday, May 24, 2016

Image to data frame to ggplot heatmap

Image to DF to ggplot

In this post we will convert an image to a dataframe to display the image with ggplot. In a later post we will compress the image.

library("png", lib.loc="C:/TFS/Rlib/")
require(ggplot2)
## Loading required package: ggplot2

This is the original image:

Original.

The work

plaatje = readPNG("c:/TFS/test.png")
x <- data.frame(1:100,1,1,1,1,1)
colnames(x) = c("t", "x","y", "r", "g", "b")
for (a in 1:3) {
teller = 1
for (i in 1:length(plaatje[,1,a])) { #Hoogte
for (j in 1:length(plaatje[i,,a])) { #Breedte
x[teller,c(1:3,3+a)] = c(teller,j,i,plaatje[i,j,a])
teller = teller + 1
}
}
}


ggplot(data=x, aes(x=x, y=y, fill=rgb(r,g,b))) +
geom_tile() +
scale_fill_identity()