---
language:
- en
library_name: sklearn
tags:
- malware-detection
- tabular-classification
- lightgbm
- scikit-learn
pipeline_tag: tabular-classification
license: mit
metrics:
- roc_auc
- accuracy
datasets:
- fabriciojoc/brazilian-malware-dataset
model-index:
- name: malware-detection-lgbm
  results:
  - task:
      type: tabular-classification
      name: Malware Detection
    dataset:
      name: Brazilian Malware Dataset (hold-out test set)
      type: tabular
    metrics:
    - type: roc_auc
      value: 0.9978
      name: AUC
    - type: accuracy
      value: 0.9895
      name: Accuracy
---

# Malware Detection LightGBM

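A LightGBM binary classifier for static malware detection on Windows PE (Portable Executable) files, trained and evaluated on the Brazilian Malware Dataset (`fabriciojoc/brazilian-malware-dataset`).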

## Performance (hold-out test set)

- AUC: `0.9978`
- Accuracy: `0.9895`
- Confusion matrix: `[[4158, 66], [39, 5774]]` (10,037 samples, 105 misclassified)
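
The reported accuracy can be re-derived from the confusion matrix, which also yields per-class rates the card does not list. The sketch below is only an illustration: it assumes the scikit-learn layout (rows are true labels, columns are predicted labels) and that label `1` means malware, neither of which is stated here. Accuracy does not depend on those assumptions; precision, recall, and false positive rate do.

```python
# Confusion matrix as reported in this card.
cm = [[4158, 66], [39, 5774]]

# ASSUMPTION: cm[i][j] counts samples with true label i predicted as label j
# (scikit-learn convention), and label 1 is the malware class.
(tn, fp), (fn, tp) = cm

total = tn + fp + fn + tp
accuracy = (tp + tn) / total           # independent of the layout assumption
precision = tp / (tp + fp)             # share of flagged files that are malware
recall = tp / (tp + fn)                # share of malware that is flagged
false_positive_rate = fp / (fp + tn)   # share of benign files that are flagged

print(f"samples:             {total}")
print(f"accuracy:            {accuracy:.4f}")
print(f"precision:           {precision:.4f}")
print(f"recall:              {recall:.4f}")
print(f"false positive rate: {false_positive_rate:.4f}")
```

The computed accuracy rounds to `0.9895`, matching the value reported above.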

## Artifacts

- `production_model.joblib`
- `preprocessing_pipeline.joblib`
- `feature_names.json`
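
A minimal loading sketch for the three files above, under several assumptions that this card does not confirm: the files have been downloaded to a local directory, `feature_names.json` holds a flat list of column names, the preprocessing object exposes `transform`, and the model exposes a scikit-learn style `predict_proba` whose column 1 is the positive class. The helper names `load_artifacts` and `score_frame` are hypothetical. Unpickling the model will likely require `lightgbm` and `scikit-learn` to be installed.

```python
import json
from pathlib import Path

import joblib
import pandas as pd


def load_artifacts(artifact_dir):
    """Load the three artifact files from a local directory."""
    root = Path(artifact_dir)
    model = joblib.load(root / "production_model.joblib")
    pipeline = joblib.load(root / "preprocessing_pipeline.joblib")
    # ASSUMPTION: the JSON file is a flat list of feature (column) names.
    feature_names = json.loads((root / "feature_names.json").read_text())
    return model, pipeline, feature_names


def score_frame(df, model, pipeline, feature_names):
    """Return one score per row of df (probability of class 1)."""
    missing = [name for name in feature_names if name not in df.columns]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")
    # Select the columns in the order listed in feature_names.
    features = pipeline.transform(df[feature_names])
    # ASSUMPTION: scikit-learn style classifier; column 1 is the positive class.
    return model.predict_proba(features)[:, 1]


# Hypothetical usage (both paths are placeholders):
# model, pipeline, feature_names = load_artifacts("./artifacts")
# scores = score_frame(pd.read_csv("samples.csv"), model, pipeline, feature_names)
```

Only load joblib files from a source you trust, since unpickling can execute arbitrary code.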

## Notes

- This repository contains model artifacts only; training and inference code are not included.
- For large CSV batch inference, use the Render app:
  `https://malware-detection-ml-mihai.onrender.com/upload`