FastForest binary trainer does not output Probability although the documentation says it does. #7398

Open
superichmann opened this issue Feb 24, 2025 · 1 comment
Labels
untriaged New issue has not been triaged

Comments

@superichmann

Why is there no Probability column in the FastForest binary classification output?

Repro in a Polyglot Notebook (VS Code, C#):

#r "nuget:Microsoft.ML"
#r "nuget:Microsoft.ML.LightGbm"
#r "nuget:Microsoft.ML.FastTree"
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public class ModelInput
{
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
    public bool Label { get; set; }
}


// Create a new MLContext
var mlContext = new MLContext();

// Define the in-memory training data
var data = new[]
{
    new ModelInput { Feature1 = 1f, Feature2 = 2f, Label = true },
    new ModelInput { Feature1 = 3f, Feature2 = 4f, Label = false },
    new ModelInput { Feature1 = 5f, Feature2 = 6f, Label = true },
    new ModelInput { Feature1 = 7f, Feature2 = 8f, Label = false },
    new ModelInput { Feature1 = 9f, Feature2 = 10f, Label = true }
};

// Load the training data
var trainData = mlContext.Data.LoadFromEnumerable(data);

// Define the FastForest binary classification trainer
var trainer = mlContext.BinaryClassification.Trainers.FastForest();

// Train the model
var pipeline = mlContext.Transforms.Concatenate("Features", nameof(ModelInput.Feature1), nameof(ModelInput.Feature2))
    .Append(trainer);

var model = pipeline.Fit(trainData);

// Define new data points for prediction
var newData = new[]
{
    new ModelInput { Feature1 = 2f, Feature2 = 3f },
    new ModelInput { Feature1 = 4f, Feature2 = 5f },
    new ModelInput { Feature1 = 6f, Feature2 = 7f },
    new ModelInput { Feature1 = 8f, Feature2 = 9f },
    new ModelInput { Feature1 = 10f, Feature2 = 11f }
};

// Load the new data
var newDataView = mlContext.Data.LoadFromEnumerable(newData);

// Make predictions on the new data
var transformedNewData = model.Transform(newDataView);

// Extract the Probability column
var probabilities = transformedNewData.GetColumn<float>("Probability").ToArray();

// Extract the Feature1 and Feature2 columns
var feature1 = newData.Select(x => x.Feature1).ToArray();
var feature2 = newData.Select(x => x.Feature2).ToArray();

// Print the Probability scores for each prediction
for (int i = 0; i < probabilities.Length; i++)
{
    Console.WriteLine($"Feature1: {feature1[i]}, Feature2: {feature2[i]}, Probability: {probabilities[i]}");
}

Error: System.ArgumentOutOfRangeException: Column 'Probability' not found (Parameter 'name')
at Microsoft.ML.DataViewSchema.get_Item(String name)
at Microsoft.ML.Data.ColumnCursorExtensions.GetColumn[T](IDataView data, String columnName)
at Submission#18.<>d__0.MoveNext()
--- End of stack trace from previous location ---
at Microsoft.CodeAnalysis.Scripting.ScriptExecutionState.RunSubmissionsAsync[TResult](ImmutableArray`1 precedingExecutors, Func`2 currentExecutor, StrongBox`1 exceptionHolderOpt, Func`2 catchExceptionOpt, CancellationToken cancellationToken)

The documentation says the FastForest binary trainer outputs a Probability column. Why is it missing from the transformed data?

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged label Feb 24, 2025
@superichmann
Author

If I change the trainer to LightGbm, the Probability column is present.
