Refactor preprocessing

Emanuele Roberto Nocera requested to merge refactor_preprocessing into master

Created by: APJansen

This PR simplifies the preprocessing layer, improving readability without changing any results. I think these changes are harmless and uncontroversial and can easily be merged.

Additional changes proposed

However, my goal was a slightly bigger change that may cause issues, so I put the last commit in a separate branch, prepro_join_weights. It unites all the preprocessing factors into a single alpha vector and a single beta vector, rather than having many separate scalars. This layer is the only place in the model where flavors are treated individually. Each parameter has flavor-dependent min and max values that go into both the initializer and a constraint, but these can be handled as vectors as well.
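
To make the idea concrete, here is a minimal NumPy sketch of what joining the per-flavor scalars into vectors could look like. It is not the layer's actual code: it assumes the preprocessing factor has the form x^(1-alpha) (1-x)^beta, and the function names, the three example flavors, and their limits are all made up for illustration.

```python
import numpy as np

# Hypothetical per-flavor limits (illustrative values, not taken from a runcard).
alpha_min = np.array([0.5, 0.0, 1.0])
alpha_max = np.array([1.5, 1.0, 2.0])
beta_min = np.array([1.0, 2.0, 3.0])
beta_max = np.array([3.0, 4.0, 5.0])


def init_weights(rng, lower, upper):
    """Draw one value per flavor, each uniformly within its own [lower, upper)."""
    return rng.uniform(lower, upper)


def constrain(weights, lower, upper):
    """Element-wise clipping: the vector analogue of a per-scalar min/max constraint."""
    return np.clip(weights, lower, upper)


def prefactor(x, alpha, beta):
    """Broadcast x of shape (n_points, 1) against exponents of shape (n_flavors,)."""
    return x ** (1.0 - alpha) * (1.0 - x) ** beta


rng = np.random.default_rng(0)
alpha = init_weights(rng, alpha_min, alpha_max)
beta = init_weights(rng, beta_min, beta_max)

x = np.linspace(0.01, 0.99, 200).reshape(-1, 1)
out = prefactor(x, alpha, beta)  # shape (200, 3): one column per flavor

# A weight pushed out of range is pulled back to its own flavor's limit.
alpha_clipped = constrain(alpha + 10.0, alpha_min, alpha_max)
```

The point of the sketch is that both the initializer and the constraint only need the limits as arrays; nothing in them requires a separate weight object per flavor.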

The reason this may be controversial is that it doesn't allow setting weights as trainable on a flavor-by-flavor basis, only all alphas and/or all betas. I don't see why this would be necessary, but I think I did see a runcard that does this.

Timing

I did some timing tests as well, creating a model with only the preprocessing layer and training it on random targets for 10_000 epochs:

branch                    time (s)
master                    38
refactor_preprocessing    36
prepro_join_weights       24

The timing script is something like:

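    from datetime import datetime

    import tensorflow as tf
    from tensorflow.keras import Input, Model

    # Preprocessing (the layer under test) and flav_info (its flavor
    # configuration) are assumed to be imported/defined before this point.
    x_in = Input(shape=(200, 1), batch_size=1)  # renamed to avoid shadowing the builtin input()
    prepro = Preprocessing(flav_info=flav_info, seed=0)
    output = prepro(x_in)
    model = Model(inputs=x_in, outputs=output)

    # Random inputs and targets; the target's last axis matches the layer output
    test_x = tf.random.uniform(shape=(1, 200, 1), seed=42)
    test_y = tf.random.uniform(shape=(1, 200, 8), seed=43)

    model.compile(loss="mse", optimizer="adam")

    # Time the full training run
    start = datetime.now()
    model.fit(test_x, test_y, epochs=10_000)
    end = datetime.now()
    diff = end - start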

Checks

  • refactor_preprocessing: the regression test passes, so the results are identical. Note, though, that even when specifying a seed, the initialization depends on the TensorFlow version: the regression test passes on Snellius but not on my laptop.
  • prepro_join_weights: the initialization is different when the weights are a vector, so the regression test needs to be updated. I did check that, after manually setting the weights to what they were in master, the results are still the same. Other regression tests and the structure of saved models will need to be changed as well; I'll do that only if these changes are approved.
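
The manual check described for prepro_join_weights can be sketched roughly as follows: take scalar weights, stack them into vectors, and confirm that both forms give the same output. This is a NumPy illustration with made-up values, assuming a factor of the form x^(1-alpha) (1-x)^beta; the function names are hypothetical and this is not the regression test itself.

```python
import numpy as np


def prefactor_scalars(x, alphas, betas):
    """One scalar pair per flavor; the per-flavor columns are concatenated."""
    columns = [x ** (1.0 - a) * (1.0 - x) ** b for a, b in zip(alphas, betas)]
    return np.concatenate(columns, axis=-1)


def prefactor_vectors(x, alpha, beta):
    """Single alpha and beta vectors broadcast against x."""
    return x ** (1.0 - alpha) * (1.0 - x) ** beta


# Hypothetical scalar weights, standing in for values copied from the old layout.
alphas = [1.1, 0.7, 1.4]
betas = [2.0, 3.5, 4.2]

x = np.linspace(0.01, 0.99, 200).reshape(-1, 1)
old = prefactor_scalars(x, alphas, betas)
new = prefactor_vectors(x, np.array(alphas), np.array(betas))

# True when the two layouts agree to floating-point tolerance.
same = np.allclose(old, new)
```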
