Cocktail recipes generated by neural nets

Inspired by AIweirdness, I used the open-source pretrained recurrent neural network textgenrnn to generate cocktail recipes. There are meal recipes generated by RNNs out there (Tom Brewe’s), but I was hoping that the relative simplicity of cocktail recipes would yield more realistic results. (Spoiler: not really)

I got a recipe database thanks to The Cocktail DB (I used API queries to get the recipes), which I supplemented with other recipes from The Webtender (this time using HTML scraping). The full dataset contains approximately 2000 recipes; this small size is probably one of the main reasons for the low quality of my generated recipes. These recipes are user-contributed, so there are some sentences in the dataset that are not strictly related to recipes, like “Works great for barbecues !”

Below is some data exploration: a graph representation of the ingredients appearing together in the recipes, using R and edgebundleR.


As for the neural network, it is incredibly simple to set up; literally 4 lines of code in Python! Here are some generated cocktails using 10 epochs and various temperatures (temperature is a kind of “creativity” parameter):

“Citrus Shot #2: 1 bottle Brandy, 1 part Tequila, 1 part Rum (Bacardi), 1 part Grenadine. Pour the sugar and sugar and sweet and stir well. Strain into glass and pour into mouth. Stir well. Drop the recipes. Fill up with granule shot.”

“Chickly Preside: 1 1/2 oz Gin, 1 oz Lime juice, 1 1/2 tsp White Creme de Menthe, 1 oz Vodka, 5 oz of specifice, 5 1/2 oz Gin, 1 whole Egg. Shake. Pour into a shot glass. Serve. ”

“Pumpard Sight: 1 1/2 tsp Grenadine, 1 tsp Cinnamon, 3 dashes Triple sec, 2 oz Sour mix, 1 splash Jägermeister. Mix all ingredients avolaxed in a shaker with ice, shaken, add the slices and coffee until the vodka. Strain and add the Irish cream and pour into shot glass. Add Bailey’s”


On the plus side, the network learnt that the recipe format is “Name: Ingredient List. Instructions.” The principal problems that make these recipes unrealistic are the almost-systematic lack of connection between the cocktail name and the recipe (like the citrus-less 1st recipe above), inconsistent dosages (mixing “1 part” with “1 oz”), ingredients appearing in the instructions but not in the ingredients list, and repeated ingredients (“… 1 oz Gin, 2 oz Gin, 1 oz Gin …”). The repeated ingredients can be avoided by setting the neural net temperature to a higher value, but there is a fine line between conservative and gibberish. In the end, I did not suggest these recipes to my company's Data Science team for after-work drinks as I originally intended.
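To make the effect of temperature more concrete, here is a minimal sketch of temperature-scaled sampling probabilities in plain Python. This is not textgenrnn's actual code; the function name `apply_temperature` and the toy probabilities are assumptions made for illustration.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a probability distribution as p_i ** (1 / T), then renormalise."""
    # work in log space: a low T sharpens the distribution, a high T flattens it
    logits = [math.log(p) / temperature for p in probs]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# hypothetical next-character probabilities predicted by a network
probs = [0.6, 0.3, 0.1]

for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

At a low temperature the most likely choice takes almost all of the probability mass (conservative, repetitive output), while a high temperature gives unlikely choices a better chance (more varied output, eventually gibberish).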


Data Science Bowl 2018 Kaggle competition

I have at last found the time to try Keras ! It is so much easier to get started and running than Tensorflow, although of course this comes at the price of some flexibility.

I’m trying my hand at the Kaggle Data Science Bowl 2018 competition, on the topic of object segmentation, which in this case means delimiting cells in medical imagery. The training data are microscopy photographs of cells, and the label of each image is a set of masks (binary pictures), one for each of the individual cells in the picture. Below are some examples of input images in the training set. Sizes and aspect ratios vary, but are often around 256×256 pixels.

An example of an input image (training example) and its masks (labels):


As usual, the Kaggle forums are full of helpful people providing starter notebooks; I used the Keras U-Net starter as the basis for my explorations.

The neural network in this approach is a simplified version of the one described in this paper; each layer has fewer channels. I kept this simplification because I’m working with a Google Colaboratory notebook, and with the notebook limitations, using the same number of features as in the paper on each layer makes the training very slow, as in the “surprise! notebook disconnected” sort of slow. Anyway, Colaboratory is still great, considering that it is a free tool. It’s a Jupyter-based Python notebook running on shared VMs at Google, and the VM comes with a free GPU! The VM state is not persistent, so after 90 minutes of inactivity the notebook disconnects and the VM is lost. This means any package installations or data downloads have to be redone at every restart. Thankfully, Linux commands can be integrated in the notebook simply by starting the command with “!”.

Here are the preliminary steps I added to the original Unet Starter Notebook:

!pip install -q tqdm
!git clone -q
!pip install -q kaggle
!mkdir -p /content/.kaggle/
!mkdir -p data-science-bowl-2018/original
!mkdir -p data-science-bowl-2018/merged/stage1_train
# unsaaaaaafe
!echo '{"username":"mykagglename","key":"mykaggleapikeygoeshere"}' > /content/.kaggle/kaggle.json
!kaggle competitions download -c data-science-bowl-2018 -p data-science-bowl-2018/original/.
!unzip -q -o "data-science-bowl-2018/original/" -d "data-science-bowl-2018/original/stage1_test"
!unzip -q -o "data-science-bowl-2018/original/" -d "data-science-bowl-2018/original/stage1_train"
!cp -r data-science-bowl-2018/original/stage1_train/* data-science-bowl-2018/merged/stage1_train/.
!cp -r kaggle-dsbowl-2018-dataset-fixes/stage1_train/* data-science-bowl-2018/merged/stage1_train/.

Now comes the Python script starting with some imports and global variables:

import os
import sys
import warnings
from subprocess import check_output

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import random
from skimage.morphology import extrema
import matplotlib as mp

from tqdm import tqdm
from itertools import chain

import skimage.color
import skimage.transform
import skimage.morphology
from scipy import ndimage as ndi
from skimage.morphology import watershed
from skimage.feature import peak_local_max

from keras import backend as K
from keras.models import Model, load_model
from keras.layers import Input
from keras.layers.core import Dropout, Lambda
from keras.layers.convolutional import Conv2D, Conv2DTranspose
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard
from keras.preprocessing.image import ImageDataGenerator
from keras.losses import binary_crossentropy

IMG_HEIGHT = 512 # all images are resized to 512x512
IMG_WIDTH = 512
IMG_CHANNELS = 3 # use only RGB channels
BATCH_SIZE = 10
TRAIN_PATH = '/content/data-science-bowl-2018/merged/stage1_train'
TEST_PATH = '/content/data-science-bowl-2018/original/stage1_test'


cmap2 = mp.colors.ListedColormap(np.concatenate((np.array([[0,0,0]]), np.random.rand(256,3))))

warnings.filterwarnings('ignore', category=UserWarning, module='skimage')

The images in the dataset have different sizes, so we are going to resize them all to the same size; the original notebook uses 128×128 pixels, I think to keep the training fast enough for Kagglers to be able to run the notebook easily. I used 512×512, which is closer to the size of most pictures and allows us to keep more information.

I also added a function to read the pictures and merge the masks into a single image:

def imread_grey(f):
    return, as_grey=True)

def read_image_labels(image_id, merge=True):
    # return image with id number image_id 
    # and label = union of masks if merge = True (binary picture), 
    # if merge = False: label = matrix with distinct integers for each cell
    # (useful for plotting for performance checks)
    image_file = "{}/{}/images/{}.png".format(TRAIN_PATH, image_id, image_id)
    mask_file = "{}/{}/masks/*.png".format(TRAIN_PATH, image_id)
    image =
    masks =, load_func=imread_grey).concatenate()
    masks_union = np.sum(masks, axis=0) # has 2 if 2 cells are overlapping, 3 if 3 overlap etc
    if merge:
        # assumes masks and image have same dimensions
        labels = masks_union > 0
    else:
        # label 0 is kept for the background, so cell labels start at 1
        masks_indexed = [(masks[i] > 0) * (i + 1) for i in range(len(masks))]
        # overlapping cells attributed to last one in mask name order
        labels = np.max(masks_indexed, axis=0)

    return image, labels

Example result:

We make a big simplification here by merging all the masks. This means that cells that touch or overlap can’t be individually contoured anymore … So it’s on the user to post-process the predicted masks to separate large clusters into individual cells. This is especially important since the metric used for scoring penalises missed cells more than inaccurately contoured cells.
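The loss of information caused by the merge can be illustrated with a small sketch in plain Python. The helper `count_blobs` and the two toy masks are made up for this example; they are not part of the notebook.

```python
def count_blobs(grid):
    """Count 4-connected groups of non-zero pixels with a flood fill."""
    height, width = len(grid), len(grid[0])
    seen = set()
    blobs = 0
    for r in range(height):
        for c in range(width):
            if not grid[r][c] or (r, c) in seen:
                continue
            blobs += 1  # found a pixel belonging to a blob not visited yet
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and grid[ny][nx] and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return blobs

# two toy cell masks that touch along one edge
mask_a = [[1, 1, 0, 0],
          [1, 1, 0, 0]]
mask_b = [[0, 0, 1, 1],
          [0, 0, 1, 1]]

# union of the masks, as in the merged training label
union = [[a or b for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(mask_a, mask_b)]

cells_before = count_blobs(mask_a) + count_blobs(mask_b)
cells_after = count_blobs(union)
print(cells_before, cells_after)  # 2 1
```

Two separate cells become a single connected region in the merged label, which is why some splitting step is needed after prediction.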

We then build the X and Y arrays of training and test set examples:

# Get train and test IDs
test_ids = next(os.walk(TEST_PATH))[1]
train_ids = next(os.walk(TRAIN_PATH))[1]

def preprocess(image_input):
    # force same number of channels for grayscale pics as for color pics (3)
    rgb_img = skimage.color.grey2rgb(image_input, alpha=False)
    return skimage.transform.resize(rgb_img, (IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), mode='constant', preserve_range=True)

# Get and resize train images and masks
size_train = len(train_ids)
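X_train = np.zeros((size_train, IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.uint8)
# additional "dim" around each value for unet input; use np.squeeze for plots
Y_train = np.zeros((size_train, IMG_HEIGHT, IMG_WIDTH, 1), dtype=np.bool)
# one distinct integer per cell, used only for plotting and performance checks
Y_train_distinct = np.zeros((size_train, IMG_HEIGHT, IMG_WIDTH, 1), dtype=np.uint16)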

print('Getting and resizing train images and masks ... ')
for n, image_id in tqdm(enumerate(train_ids), total=size_train):
    image, mask = read_image_labels(image_id)
    X_train[n] = preprocess(image)
    Y_train[n] = np.expand_dims(skimage.transform.resize(mask, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True), axis=-1)
    image, mask = read_image_labels(image_id, merge=False)
    Y_train_distinct[n] = np.expand_dims(skimage.transform.resize(mask, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True), axis=-1)

# Get and resize test images
X_test = np.zeros((len(test_ids), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.uint8)
sizes_test = []
print('Getting and resizing test images ... ')
for n, id_ in tqdm(enumerate(test_ids), total=len(test_ids)):
    image = + "/"+id_ + "/images/" + id_ + '.png')[:,:,:IMG_CHANNELS]
    X_test[n] = preprocess(image)

I tried different losses for the network; the mean IoU as defined in this competition (using thresholds) is not available in either the Keras or the Tensorflow libraries. A Dice loss (the Dice coefficient is closely related to intersection over union, but not identical to it) gives the best results.
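As a quick check of how the Dice coefficient relates to intersection over union (IoU), here is a small sketch on sets of pixel indices. The two toy masks are hypothetical; the identity Dice = 2·IoU / (1 + IoU) is a standard result.

```python
def dice(a, b):
    """Dice coefficient of two sets of pixel indices."""
    return 2 * len(a & b) / (len(a) + len(b))

def iou(a, b):
    """Intersection over union (Jaccard index) of two sets of pixel indices."""
    return len(a & b) / len(a | b)

truth = {0, 1, 2, 3}   # pixels of a hypothetical true mask
pred = {2, 3, 4, 5}    # pixels of a hypothetical predicted mask

d, j = dice(truth, pred), iou(truth, pred)
print(d, j)

# the two scores rank predictions the same way, since Dice = 2*IoU / (1 + IoU)
assert abs(d - 2 * j / (1 + j)) < 1e-12
```

Both scores are 0 for disjoint masks and 1 for identical masks, but Dice is the more lenient of the two in between.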

This is the loss function and the U-net network:

def dice_coef(y_true, y_pred):
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)

inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
s = Lambda(lambda x: x / 255) (inputs) # scale pixel values to [0, 1]

c1 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (s)
c1 = Dropout(0.1) (c1)
c1 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c1)
p1 = MaxPooling2D((2, 2)) (c1)

c2 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p1)
c2 = Dropout(0.1) (c2)
c2 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c2)
p2 = MaxPooling2D((2, 2)) (c2)

c3 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p2)
c3 = Dropout(0.2) (c3)
c3 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c3)
p3 = MaxPooling2D((2, 2)) (c3)

c4 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p3)
c4 = Dropout(0.2) (c4)
c4 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c4)
p4 = MaxPooling2D(pool_size=(2, 2)) (c4)

c5 = Conv2D(256, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p4)
c5 = Dropout(0.3) (c5)
c5 = Conv2D(256, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c5)

u6 = Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same') (c5)
u6 = concatenate([u6, c4])
c6 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u6)
c6 = Dropout(0.2) (c6)
c6 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c6)

u7 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same') (c6)
u7 = concatenate([u7, c3])
c7 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u7)
c7 = Dropout(0.2) (c7)
c7 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c7)

u8 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same') (c7)
u8 = concatenate([u8, c2])
c8 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u8)
c8 = Dropout(0.1) (c8)
c8 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c8)

u9 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same') (c8)
u9 = concatenate([u9, c1], axis=3)
c9 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u9)
c9 = Dropout(0.1) (c9)
c9 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c9)

outputs = Conv2D(1, (1, 1), activation='sigmoid') (c9)

model = Model(inputs=[inputs], outputs=[outputs])
model.compile(optimizer='adam', loss=dice_loss, metrics=['accuracy'])

My understanding of this is that the descending branch is a classic convnet, i.e. convolutional layers with several filters (or channels) to extract various features that get more and more high level with depth, interleaved with max-pooling layers that coarsen the spatial localization information. The ascending branch reconstructs a high resolution image using the learned features at every step in the descending branch, so that the detailed knowledge of what happens at high resolution (first layers) can be combined with high level understanding of the image (middle layers). The paper also does loss weighting to force the network to learn the thin boundaries between objects, which I should find time to do here as well!
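To see how the two branches fit together, the feature map sizes can be traced with a few lines of plain Python. This is only bookkeeping, not a Keras model; the function name `unet_shapes` is made up, and it assumes 'same' padding, 2×2 max-pooling and stride-2 transposed convolutions as in the network above.

```python
def unet_shapes(size, in_channels=3, filters=(16, 32, 64, 128, 256)):
    """Trace (name, height/width, channels) through a U-Net like the one above."""
    shapes = [("input", size, in_channels)]
    skips = []
    # descending branch: convolutions set the channels, pooling halves the size
    for depth, f in enumerate(filters[:-1], start=1):
        shapes.append((f"c{depth}", size, f))
        skips.append((size, f))
        size //= 2
        shapes.append((f"p{depth}", size, f))
    # bottom of the U
    depth = len(filters)
    shapes.append((f"c{depth}", size, filters[-1]))
    # ascending branch: upsample, concatenate the skip connection, convolve
    for f in reversed(filters[:-1]):
        depth += 1
        size *= 2
        skip_size, skip_channels = skips.pop()
        assert skip_size == size  # skip and upsampled maps must line up
        shapes.append((f"u{depth}", size, f + skip_channels))
        shapes.append((f"c{depth}", size, f))
    shapes.append(("output", size, 1))
    return shapes

for name, side, channels in unet_shapes(512):
    print(f"{name:>6}: {side}x{side}x{channels}")
```

With four pooling steps, the input side has to be a multiple of 16 for the skip connections to line up, which both 128 and 512 satisfy.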

In order to get more training examples, I added data augmentation, which can be done on-the-fly using a generator, i.e. there is no need to create a function to rotate/crop/flip and no need to create a new training array with the added augmentations. The training finally looks like this:

earlystopper = EarlyStopping(patience=5, verbose=1)
checkpointer = ModelCheckpoint('model-dsbowl2018.h5', verbose=1, save_best_only=True)

val_cutoff = int(size_train*0.9)
X_train2 = X_train[:val_cutoff]
Y_train2 = Y_train[:val_cutoff]
X_val = X_train[val_cutoff:]
Y_val = Y_train[val_cutoff:]

def data_generator(input, augment=False):
    # for keras on the fly data augmentation
    if augment:
        data_gen_args = dict(shear_range=0.5, rotation_range=50, zoom_range=0.2,
                         width_shift_range=0.2, height_shift_range=0.2, fill_mode='reflect')
    else:
        data_gen_args = dict()
    datagen = ImageDataGenerator(**data_gen_args)
    # Provide the same seed and keyword arguments to the fit and flow methods, augment=True, seed=42) # TODO: is this useful ?
    return datagen.flow(input, batch_size=BATCH_SIZE, shuffle=True, seed=42)

# combine generators into one which yields image and masks
train_generator = zip(data_generator(X_train2, augment=True), data_generator(Y_train2, augment=True))
val_generator = zip(data_generator(X_val), data_generator(Y_val))

# steps per epoch without data generation = nbr_samples/batch size (around 60 with batch = 10)
results = model.fit_generator(train_generator,
                              validation_data = val_generator,
                              steps_per_epoch = 250,
                              validation_steps = 10,
                              callbacks=[earlystopper, checkpointer])
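The important constraint with on-the-fly augmentation for segmentation is that an image and its mask must receive exactly the same random transformation, which is what the shared seed is for in the two Keras generators. Below is a minimal sketch of that idea in plain Python, using a single generator, nested lists and random flips only instead of ImageDataGenerator; all names in it are made up for illustration.

```python
import random

def flip_horizontal(img):
    """Mirror each row of a 2D list."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Reverse the order of the rows of a 2D list."""
    return img[::-1]

def augmented_pairs(images, masks, seed=42):
    """Yield (image, mask) pairs forever, with the same random flips applied to both."""
    rng = random.Random(seed)
    while True:
        i = rng.randrange(len(images))
        image, mask = images[i], masks[i]
        if rng.random() < 0.5:
            image, mask = flip_horizontal(image), flip_horizontal(mask)
        if rng.random() < 0.5:
            image, mask = flip_vertical(image), flip_vertical(mask)
        yield image, mask

# toy 2x3 "image" whose mask marks the pixels brighter than 100
images = [[[10, 200, 30], [40, 50, 250]]]
masks = [[[int(v > 100) for v in row] for row in images[0]]]

gen = augmented_pairs(images, masks)
for _ in range(5):
    image, mask = next(gen)
    # the mask still lines up with the bright pixels after augmentation
    assert mask == [[int(v > 100) for v in row] for row in image]
```

If the image and the mask were flipped independently, the final assertion would fail for about half of the samples.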

I tried using Tensorboard, but I couldn’t make it work in Colaboratory using the method described in this notebook.

In order to separate the touching cells, I tried watershed postprocessing, but it invents separations everywhere on elongated shapes (although it works quite well when the touching cells look like two almost tangent circles). Applying my simple watershed split to all the predictions does not improve performance.

The submission is built using run-length encoding (source). The submission file should look like:
0114f484a16c152baa2d82fdd43740880a762c93f436c8988ac461c5c9dbe7d5,1 1 -> nuclei #1
0999dab07b11bc85fb8464fc36c947fbd8b5d6ec49817361cb780659ca805eac,1 1 -> nuclei #1
0999dab07b11bc85fb8464fc36c947fbd8b5d6ec49817361cb780659ca805eac,2 3 8 9 -> nuclei #2

In the rle_encoding function, x is a matrix of 1’s (points of interest) and non-1 values; the function flattens the transpose of x and returns a list containing, for each batch of consecutive 1’s, the 1-based index where the batch starts followed by the number of 1’s in this batch ([index_batch_1, nbr_of_ones_batch_1, index_batch_2, nbr_of_ones_batch_2 …]).
For example, if x is
The function returns [3,2,12,4,24,2,37,1,46,2].

def rle_encoding(x):
    dots = np.where(x.T.flatten() == 1)[0] # indexes of the 1's
    run_lengths = []
    prev = -2
    for b in dots:
        if (b>prev+1): run_lengths.extend((b + 1, 0)) # if we hit a new batch of 1's, store its 1-based start index and a length of 0
        run_lengths[-1] += 1 # add 1 to the last element of run_lengths (the length of the current batch of 1's)
        prev = b
    return run_lengths

def prob_to_rles(x, cutoff=0.5):
    # x is the labelled output of the watershed split: 0 for background, i for cell number i
    # without the watershed split, label the connected regions of the thresholded prediction first:
    # x = skimage.morphology.label(x > cutoff)
    for i in range(1, x.max() + 1): # for each labelled region, yield its rle representation
        yield rle_encoding(x == i)

new_test_ids = []
rles = []
for n, id_ in enumerate(test_ids):
    split_cells = wathershed_split(preds_test_upsampled[n])
    rle = list(prob_to_rles(split_cells))
    rles.extend(rle)
    new_test_ids.extend([id_] * len(rle))

# Create submission DataFrame
sub = pd.DataFrame()
sub['ImageId'] = new_test_ids
sub['EncodedPixels'] = pd.Series(rles).apply(lambda x: ' '.join(str(y) for y in x))
sub.to_csv('sub-dsbowl2018-12.csv', index=False)
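A simple way to sanity-check an encoding is to decode it again and compare with the original mask. The sketch below is a plain-Python illustration of the same column-major, 1-indexed convention, working on nested lists; `rle_encode` and `rle_decode` are names made up for this example and are not the functions used for the submission.

```python
def rle_encode(mask):
    """Run-length encode a binary mask (list of rows), column by column, 1-indexed."""
    height, width = len(mask), len(mask[0])
    # flatten top to bottom, then left to right (column-major order)
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    runs = []
    prev = 0
    for pos, value in enumerate(flat, start=1):
        if value == 1:
            if prev == 1:
                runs[-1] += 1          # extend the current run
            else:
                runs.extend((pos, 1))  # start a new run at this position
        prev = value
    return runs

def rle_decode(runs, height, width):
    """Inverse of rle_encode: rebuild the binary mask from (start, length) pairs."""
    flat = [0] * (height * width)
    for start, length in zip(runs[0::2], runs[1::2]):
        for pos in range(start - 1, start - 1 + length):
            flat[pos] = 1
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]

mask = [[0, 1, 0],
        [1, 1, 0],
        [0, 0, 1]]
encoded = rle_encode(mask)
print(encoded)  # [2, 1, 4, 2, 9, 1]
assert rle_decode(encoded, 3, 3) == mask
```

Reading the mask column by column gives 0 1 0 1 1 0 0 0 1, so the runs of 1’s start at positions 2, 4 and 9 with lengths 1, 2 and 1.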

And finally the Kaggle submission can be done via the API:

!kaggle competitions submit -c data-science-bowl-2018 -f sub-dsbowl2018-12.csv -m "added watershed splitting"

My best score using this is 0.329, which is not groundbreaking by any means. Still, when looking at the predictions on a validation set, I was pretty impressed by how well this relatively simple network, trained with few images (around 600), could segment a lot of cells. This isn’t a pretrained network whose last layers are retrained on a small dataset; the only images used are the ones from this competition.

Neural Net generated knits

Janelle Shane has trained a text generating neural network on knitting patterns (kindly contributed by Ravelry users and Stitch-Maps) and the results are hilariously absurd. The neural network has been named SkyKnit.

I tested two of these:

  • “mystery lace (copy)”; the training set does include patterns for tentacled toys, but these tentacles below are probably a consequence of SkyKnit’s inability to keep stitch counts consistent across rows. This one had some success on Twitter 🙂


  • “snowing leaves”, a pretty convincing attempt at a lace motif


SkyKnit performs better when trained exclusively on Stitch-Maps patterns, which are written in a standard and consistent way, as opposed to ordinary patterns which often include natural language alongside the knitting instructions.

Draw your own dataset

For all those who want to visually test their classification algorithms on toy data, here is an RShiny app I made to easily click & draw your own custom dataset ! It can be accessed online, and the source code is on Github.

It’s mostly useful for small two-dimensional numeric datasets that are inconvenient to build by superimposing samples from a few classic distributions.


Screenshot of the app

RShiny is a web application framework by RStudio that allows you to make neat interactive web apps entirely in R. Shiny is quite practical because it handles most of the event handling and variable updating under the hood. All we need to do is to declare the variables that are subject to interactive change as reactive values, and keep track of where the changes happen.

In this app, every time the user clicks on the canvas, a small group of points is created near the click position; the number, the class (color), and the spread of the points created at each click can be adjusted in the parameters bar. The canvas can be cleared or the latest points undone if necessary. Once the dataset is ready, it can be downloaded as a csv file containing the x,y coordinates and class of each point; the downloaded dataset is scaled to zero mean and unit variance.  This GUI is defined in the ui.R program:


  dashboardHeader(title="Draw Dataset"),
          column(4, numericInput("num_points", "Number of points per clic", 3)),
                              "Standard deviation of each clic point set",
                 selectInput("class", "Class (color)", choices=c("red"="firebrick1",
                 downloadButton("save", "Save"),
                 actionButton("clear", "Clear", icon=icon("remove")),
                 actionButton("undo", "Undo", icon=icon("undo"))
        title="Draw your own dataset by clicking on this canvas",
        plotOutput("data_plot", click="plot_click")

The code in server.R shown below holds the data frame that will contain the dataset, and updates it whenever a user-triggered event happens.
The dataset is a reactive object; its potential user-triggered changes are handled in the observe({}) blocks. The variables corresponding to user input are called input$something, where the various “something” are defined in ui.R.


server <- function(input, output, session) {

  addGroup <- function(data, n, center, sigma, class){
    # append a group of points distributed around the clic coordinates
    # to a data frame holding all the points created and their color.
    new_group <- data.frame("x"=rnorm(n, mean=center$x, sd=sigma),
                            "y"=rnorm(n, mean=center$y, sd=sigma),
    return(rbind(data, new_group))

  # initialize reactive dataset holding the points created by the user
  v <- reactiveValues(data = data.frame())

    # populates the dataset with points distributed around the clic target point
    if(!is.null(input$plot_click)) {
      # "isolate" reads fresh value of v$data without the update re-evaluating it
      # avoids infinite loop of update with rbind then re-rbind the updated data with the new group
      v$data <- isolate(addGroup(v$data, input$num_points, input$plot_click, input$sigma, input$class))

    # remove all points from the canvas and the dataset when clear button is clicked
    if(!is.null(input$clear)) {
      v$data <- data.frame()

  observeEvent(input$undo, {
    # remove the latest drawn point from the dataset when undo button is clicked
    v$data <- v$data[-nrow(v$data), ]

  output$save <- downloadHandler(
    # save the dataset as a csv file
    # scale to zero mean and unit variance before saving
    filename = function() {'DIYdataset.csv'},
    content = function(file) {
      write.csv(data.frame(scale(v$data[,c("x","y")]), "color"=v$data$class),

  output$data_plot <- renderPlot({
    # display the base plot
    plot <- ggplot() + xlim(0, 100) + ylim(0, 100) + xlab("x") + ylab("y")
    # if data is not empty, add it to plot
    # points outside of plot boundaries are added to the dataset but not displayed
    if (nrow(v$data) > 0) {
      plot <- plot + geom_point(aes(x=v$data$x, y=v$data$y),
  }, height=800)


The block in which the dataset receives new points uses “isolate” to avoid an infinite loop.
Without this, the dataset would first be updated with the new points, but RShiny would then detect that the dataset has changed and would reiterate the assignment of new points, then would detect this new change, etc.
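Leaving the Shiny machinery aside, the data logic of the app (adding a group of points at each click, undoing the last point, scaling on export) can be sketched in a few lines of plain Python. This is only an illustration with made-up names (`add_group`, `standardise`), not a port of the app.

```python
import random
import statistics

def add_group(data, n, center, sigma, label, rng=random):
    """Append n points drawn around center=(x, y) to the list of (x, y, label)."""
    cx, cy = center
    new_points = [(rng.gauss(cx, sigma), rng.gauss(cy, sigma), label) for _ in range(n)]
    return data + new_points

def standardise(values):
    """Scale to zero mean and unit (sample) variance, like R's scale()."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

rng = random.Random(0)
data = []
data = add_group(data, 3, (20, 30), 2.0, "red", rng)    # first click
data = add_group(data, 3, (70, 60), 2.0, "blue", rng)   # second click
data = data[:-1]                                        # undo removes the last point

# scale the coordinates before export, keeping the class of each point
xs = standardise([p[0] for p in data])
ys = standardise([p[1] for p in data])
rows = [(x, y, p[2]) for x, y, p in zip(xs, ys, data)]
print(len(rows))  # 5
```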

RShiny is a great tool for quickly developing apps to demo models or visualize data; this app takes only 100 lines of code !

Using Tensorflow with Docker

I started the Google Deep Learning Course featuring the Python Tensorflow library.

I’m using a Windows 7 laptop, and the installation instructions for Windows on the Tensorflow website are a bit less developed than those for Unix-based systems. The Deep Learning course comes with a Docker image packing Tensorflow and the course exercises in an IPython notebook, and this actually seems to be the simplest way to get Tensorflow on Windows.

To install and get started:

  1. Download and install the Docker Toolbox
  2. Open the Docker Quickstart terminal
  3. Paste the following command in this terminal. This will download (or open if it is already downloaded) the docker image containing Tensorflow and the course exercises.
    docker run -p 8888:8888 -it --rm
  4. At the top of your Docker terminal you should see an IP address: go to http://<this_ip_address>:8888 in your browser. This should be the home page of the Jupyter notebook featuring the course exercises. You will be able to import tensorflow when coding through this interface.

I tried this install on a Mac and it works too !