Effect of Different Insecticides on Insects

Analyzing the effectiveness of insecticides
Machine Learning
GLM
Poisson Regression
Author

Olamide Adu

Published

March 21, 2025

Image by Arjun MJ

Introduction

Understanding the efficacy of various insecticides is crucial in agriculture, environmental science, and pest control. The effectiveness of a spray determines both economic and ecological outcomes, ensuring crops are protected while minimizing chemical overuse.

In this post, we’ll analyze InsectSprays, a classic dataset that ships with R, using a generalized linear model with a Poisson response. We’ll explore how the insect counts differ across sprays on average and whether those differences are statistically significant.

Loading the Data and Required Libraries

To begin, we load the necessary libraries:

pacman::p_load(tidyverse, tidymodels, poissonreg)

Next we access the dataset.

insect_spray <- InsectSprays
head(insect_spray)
  count spray
1    10     A
2     7     A
3    20     A
4    14     A
5    14     A
6    12     A

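This dataset records an insect count (count) for each of 72 experimental units, 12 per spray type (spray, levels A to F). R’s help page for the dataset describes count as the number of insects in units treated with each insecticide, which makes it a natural use case for modeling count data.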

Summary Statistics

Let’s start by computing basic summaries to understand the distribution of effectiveness across sprays:

insect_spray |> 
  summarise(
    .by = spray,
    average_count = mean(count),
    times_used = n()
  )
  spray average_count times_used
1     A     14.500000         12
2     B     15.333333         12
3     C      2.083333         12
4     D      4.916667         12
5     E      3.500000         12
6     F     16.666667         12

This output gives the mean insect count for each spray and the number of observations behind it. The design is balanced, with 12 observations per spray.
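Because the model fitted later assumes Poisson counts, where the variance equals the mean, it can help to put the per-spray variance next to the mean. The following is a minimal base R sketch and not part of the original analysis; the names `spray_stats` and `var_to_mean` are illustrative.

```r
# Per-spray mean and variance of the counts, using base R only.
mean_count <- tapply(InsectSprays$count, InsectSprays$spray, mean)
var_count  <- tapply(InsectSprays$count, InsectSprays$spray, var)

spray_stats <- data.frame(
  spray      = names(mean_count),
  mean_count = as.numeric(mean_count),
  var_count  = as.numeric(var_count)
)

# A ratio well above 1 hints at more spread than a Poisson model expects.
spray_stats$var_to_mean <- spray_stats$var_count / spray_stats$mean_count

print(spray_stats)
```

If the ratios sit far above 1, a quasi-Poisson or negative binomial model may be worth considering as a follow-up.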

Visualizing the Results

insect_spray |> 
  # Order the bars by total count; fct_reorder() uses the median by default
  ggplot(aes(fct_reorder(spray, count, .fun = sum), count)) +
  geom_col(fill = "dodgerblue4") +
  labs(
    x = "Spray",
    y = "Total insect count",
    title = "Total Insect Count per Spray"
  ) +
  coord_flip() +
  theme_light(
    base_family = "Inter"
  ) +
  theme(
    plot.title = element_text(
      hjust = .5,
      size = 14,
      face = "bold"
    )    
  )

This bar chart shows the total count recorded under each insecticide. Sprays A, B, and F have clearly higher totals than C, D, and E.

insect_spray |> 
  summarise(
    .by = spray,
    average_count = mean(count)
  ) |> 
  ggplot(
    aes(fct_reorder(spray, average_count), average_count)
  ) +
  geom_col(fill = "coral3") +
  labs(
    x = "Spray",
    y = "Mean insect count",
    title = "Mean Insect Count per Spray"
  ) +
  theme_light(base_family = "Inter") +
  coord_flip()

This plot shows the mean count for each spray. The means fall into two groups: A, B, and F lie between 14.5 and 16.7, while C, D, and E lie between 2.1 and 4.9. A gap this large motivates a formal statistical model of the differences.

Modeling Insecticide Effectiveness with Poisson Regression

Since we are dealing with count data, Poisson regression is a natural choice. We’ll fit a Generalized Linear Model (GLM) with a Poisson distribution to examine the differences between spray types.
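Written out, the model can be sketched as follows. The notation is an assumption introduced here for illustration: y_i is the count in experimental unit i, and x_{iB}, ..., x_{iF} are indicator variables equal to 1 when unit i received that spray, with spray A as the baseline under R's default treatment coding.

```latex
\begin{aligned}
y_i &\sim \operatorname{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \beta_B x_{iB} + \beta_C x_{iC} + \beta_D x_{iD} + \beta_E x_{iE} + \beta_F x_{iF}
\end{aligned}
```

Under this coding, exp(β_0) is the expected count for spray A, and each remaining β is a difference in log expected count relative to A.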

insect_mod <- poisson_reg() |> 
  set_mode("regression") |> 
  set_engine("glm") |> 
  fit(
    count ~ spray,
    data = insect_spray
  )
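As a cross-check, the same model can be fitted directly with base R's glm(), which is the function the "glm" engine is expected to call. This sketch uses only base R; `insect_glm`, `group_means`, and `fitted_means` are illustrative names, not objects from the post.

```r
# Fit the Poisson GLM directly with stats::glm().
insect_glm <- glm(
  count ~ spray,
  family = poisson(link = "log"),
  data = InsectSprays
)

# With a single categorical predictor, the fitted mean for each spray
# should equal that spray's observed mean count.
spray_levels <- levels(InsectSprays$spray)
group_means  <- tapply(InsectSprays$count, InsectSprays$spray, mean)
fitted_means <- predict(
  insect_glm,
  newdata = data.frame(spray = factor(spray_levels, levels = spray_levels)),
  type = "response"
)

print(round(cbind(group_means, fitted_means), 3))
```

Agreement between the two columns is a quick sanity check that the model was specified as intended.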

Interpreting the Model

Let’s extract and interpret the coefficients:

insect_mod |> 
  extract_fit_engine() |> 
  tidy()
# A tibble: 6 × 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)   2.67      0.0758    35.3   1.45e-272
2 sprayB        0.0559    0.106      0.528 5.97e-  1
3 sprayC       -1.94      0.214     -9.07  1.18e- 19
4 sprayD       -1.08      0.151     -7.18  7.03e- 13
5 sprayE       -1.42      0.172     -8.27  1.37e- 16
6 sprayF        0.139     0.104      1.34  1.79e-  1

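The estimates are on the log scale. The intercept is the log of the mean count for the reference spray, which is A because R uses the first factor level as the baseline unless it is changed: exp(2.67) ≈ 14.4, which matches the mean of 14.5 in the summary table up to rounding. Every other coefficient is the difference in log mean count between that spray and A. A negative coefficient means a lower expected count than A, and a positive one means a higher expected count. Sprays C, D, and E are significantly lower than A (p < 0.001), while B and F do not differ significantly from A (p = 0.60 and p = 0.18).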
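The coefficient table compares each spray with the baseline only. To ask whether spray matters at all, one option is a likelihood-ratio test against an intercept-only model. The sketch below uses base R's glm() and anova() rather than the tidymodels objects above; `null_fit`, `full_fit`, and `lrt` are illustrative names.

```r
# Intercept-only model: one common mean for all sprays.
null_fit <- glm(count ~ 1, family = poisson, data = InsectSprays)

# Full model: a separate mean for each spray.
full_fit <- glm(count ~ spray, family = poisson, data = InsectSprays)

# Likelihood-ratio (deviance) test comparing the two nested models.
lrt <- anova(null_fit, full_fit, test = "Chisq")
print(lrt)
```

A small p-value in the second row indicates that allowing the mean to vary by spray fits the data much better than a single common mean.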

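Exponentiating a coefficient (exp(coef)) turns it into a rate ratio: the expected count for that spray divided by the expected count for the baseline. For example, exp(-1.94) ≈ 0.14, so the expected count under spray C is about 14% of that under spray A.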
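A minimal sketch of that exponentiation, again using base R's glm() so the block runs on its own. `insect_glm` and `rate_ratios` are illustrative names. The cross-check relies on the fact that, with one categorical predictor, the rate ratios should equal ratios of group means.

```r
insect_glm <- glm(count ~ spray, family = poisson, data = InsectSprays)

# Exponentiate the log-scale coefficients.
# The intercept becomes the expected count for spray A;
# the other terms become rate ratios relative to A.
rate_ratios <- exp(coef(insect_glm))
print(round(rate_ratios, 3))

# Cross-check: ratio of each spray's mean count to spray A's mean count.
group_means <- tapply(InsectSprays$count, InsectSprays$spray, mean)
print(round(group_means / group_means["A"], 3))
```

With the tidymodels fit from earlier, passing exponentiate = TRUE to tidy() should give the same ratios in tidy form.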

Conclusion

Through this simple analysis, we’ve:

Explored the insecticide effectiveness visually and numerically

Modeled the count of insects using a Poisson GLM

Identified which sprays differ significantly from the baseline spray A

This kind of analysis not only helps in selecting the most effective spray but also supports data-driven decision-making in ecological management and agricultural planning.