Most R object comparison functions are good at telling you that objects are different, but less so at conveying *how* they are different. I wrote `diffobj` to provide an “aha, that’s how they are different” comparison. In this vignette I will compare `diffPrint` to `all.equal` and to `testthat::compare`.

Disclaimer: I picked the examples here to showcase `diffobj` capabilities, not to carry out a fair and balanced comparison of these comparison functions. Nonetheless, I hope you will find the examples representative of common situations where comparison of R objects is useful.

I defined four pairs of numeric vectors for us to compare. I purposefully hid the variable definitions to simulate a comparison of unknown objects.

`all.equal(A1, B1)`

`## [1] "Mean relative difference: 0.1"`

The objects are different… At this point I would normally print both `A1` and `B1` to try to figure out how that difference came about since the “mean relative difference” is unhelpful.

`testthat::compare(A1, B1)`

```
## 1/10 mismatches
## [10] 10 - 11 == -1
```

`testthat::compare` does a better job, but I still feel the need to look at `A1` and `B1`.

`diffPrint(A1, B1)`

    @@ 1 @@                       @@ 1 @@
    < [1] 1 2 3 4 5 6 7 8 9 10    > [1] 1 2 3 4 5 6 7 8 9 11

Aha, that’s how they are different!
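The definitions of `A1` and `B1` are hidden on purpose. As a rough sketch, one pair that is consistent with the output above is shown below; these definitions are an assumption made for illustration, not necessarily the hidden ones. The `diffPrint` call is guarded so the snippet also runs where `diffobj` is not installed.

```r
# Hypothetical definitions, chosen only to match the printed output above
A1 <- as.numeric(1:10)
B1 <- A1
B1[10] <- 11                      # change only the last element

# all.equal() condenses the difference into a single summary string
ae <- all.equal(A1, B1)
print(ae)

# Finding the differing position by hand, which the diff makes visible at a glance
differing <- which(A1 != B1)
print(differing)

# Show the diff itself only if diffobj is available
if (requireNamespace("diffobj", quietly = TRUE)) {
  print(diffobj::diffPrint(A1, B1))
}
```

With these assumed vectors the summary string matches the `all.equal` output shown earlier, and the only differing position is the tenth.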

Let’s up the difficulty a little bit:

`testthat::compare(A2, B2)`

```
## 20/20 mismatches (average diff: 1.9)
## [1] 1 - 20 == -19
## [2] 2 - 1 == 1
## [3] 3 - 2 == 1
## [4] 4 - 3 == 1
## [5] 5 - 4 == 1
## [6] 6 - 5 == 1
## [7] 7 - 6 == 1
## [8] 8 - 7 == 1
## [9] 9 - 8 == 1
## ...
```

If you look closely you will see that despite a reported 20/20 mismatches, the two vectors are actually similar, at least in the visible part of the output. With `diffPrint` it is obvious that `B2` is the same as `A2`, except that the last value has been moved to the first position:

`diffPrint(A2, B2)`

    @@ 1,2 @@                            @@ 1,2 @@
    < [1] 1 2 3 4 5 6 7 8 9 10 11        > [1] 20 1 2 3 4 5 6 7 8 9 10
    < [12] 12 13 14 15 16 17 18 19 20    > [12] 11 12 13 14 15 16 17 18 19
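To see why a position-by-position comparison reports every element as a mismatch here, consider the following sketch. The definitions of `A2` and `B2` are assumptions reconstructed from the output above, and the variable names `mismatches`, `avg_diff` and `same_values` are made up for this illustration.

```r
# Hypothetical definitions consistent with the output above
A2 <- 1:20
B2 <- c(A2[20], A2[-20])           # move the last value to the front

# Element-wise view: every position is off because everything shifted by one
mismatches <- sum(A2 != B2)
avg_diff   <- mean(abs(A2 - B2))

# Order-insensitive view: the two vectors hold exactly the same values
same_values <- identical(sort(B2), A2)

cat(mismatches, "of", length(A2), "positions differ; average diff:", avg_diff, "\n")
cat("Same values once order is ignored:", same_values, "\n")
```

A single shift is enough to make an element-wise comparison look maximally different, which is the situation an alignment-based diff is meant to handle.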

`testthat::compare` throws in the towel as soon as lengths are unequal:

`testthat::compare(A3, B3)`

`## Lengths differ: 20 is not 21`

`all.equal` does the same. `diffPrint` is unfazed:

`diffPrint(A3, B3)`

    @@ 1,2 @@                            @@ 1,2 @@
    < [1] 1 2 3 4 5 6 7 8 9 10 11        > [1] 20 21 1 2 3 4 5 6 7 8 9
    < [12] 12 13 14 15 16 17 18 19 20    > [12] 10 11 12 13 14 15 16 17 18 19
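The unequal-length case can be reproduced with a pair like the one below. Again, these definitions of `A3` and `B3` are assumptions inferred from the diff above, and `only_in_A3` and `only_in_B3` are illustrative names.

```r
# Hypothetical definitions consistent with the output above
A3 <- 1:20
B3 <- c(20:21, 1:19)               # last value moved to the front, plus a new value 21

cat("lengths:", length(A3), "vs", length(B3), "\n")

# all.equal() returns a character message instead of TRUE when lengths differ
ae3 <- all.equal(A3, B3)
print(ae3)

# Values that occur in only one of the two vectors
only_in_B3 <- setdiff(B3, A3)
only_in_A3 <- setdiff(A3, B3)
print(only_in_B3)
print(only_in_A3)
```

Under these assumptions the two vectors differ by one moved value and one added value, even though the length check alone gives no hint of that.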

`diffPrint` also produces useful output for largish vectors:

```
A4 <- 1:1e4
B4 <- c(1e4 + 1, A4[-c(4:7, 9e3)])
diffPrint(A4, B4)
```

    @@ 1,4 @@                            @@ 1,4 @@
    < [1] 1 2 3 4 5                      > [1] 10001 1 2 3 8
    < [6] 6 7 8 9 10                     > [6] 9 10 11 12 13
      [11] 11 12 13 14 15                  [11] 14 15 16 17 18
      [16] 16 17 18 19 20                  [16] 19 20 21 22 23
    @@ 1798,5 @@                         @@ 1798,5 @@
      [8986] 8986 8987 8988 8989 8990      [8986] 8989 8990 8991 8992 8993
      [8991] 8991 8992 8993 8994 8995      [8991] 8994 8995 8996 8997 8998
    < [8996] 8996 8997 8998 8999 9000    > [8996] 8999 9001 9002 9003 9004
      [9001] 9001 9002 9003 9004 9005      [9001] 9005 9006 9007 9008 9009
      [9006] 9006 9007 9008 9009 9010      [9006] 9010 9011 9012 9013 9014

Do note that the comparison algorithm scales with the square of the number of *differences*, so very large and different vectors will be slow to process.

R Core and package authors put substantial effort into `print` and `show` methods. `diffPrint` takes advantage of this. Compare:

`all.equal(iris, iris[-60,])`

```
## [1] "Attributes: < Component \"row.names\": Numeric: lengths (150, 149) differ >"
## [2] "Component \"Sepal.Length\": Numeric: lengths (150, 149) differ"
## [3] "Component \"Sepal.Width\": Numeric: lengths (150, 149) differ"
## [4] "Component \"Petal.Length\": Numeric: lengths (150, 149) differ"
## [5] "Component \"Petal.Width\": Numeric: lengths (150, 149) differ"
## [ reached getOption("max.print") -- omitted 3 entries ]
```

to:

`diffPrint(iris, iris[-60,])`

    @@ 59,5 / 59,4 @@
    ~ Sepal.Length Sepal.Width Petal.Length Petal.Width Species
      58 4.9 2.4 3.3 1.0 versicolor
      59 6.6 2.9 4.6 1.3 versicolor
    < 60 5.2 2.7 3.9 1.4 versicolor
      61 5.0 2.0 3.5 1.0 versicolor
      62 5.9 3.0 4.2 1.5 versicolor

And:

`all.equal(lm(hp ~ disp, mtcars), lm(hp ~ cyl, mtcars))`

```
## [1] "Component \"coefficients\": Names: 1 string mismatch"
## [2] "Component \"coefficients\": Mean relative difference: 2.778944"
## [3] "Component \"residuals\": Mean relative difference: 0.7074011"
## [4] "Component \"effects\": Names: 1 string mismatch"
## [5] "Component \"effects\": Mean relative difference: 0.5907086"
## [ reached getOption("max.print") -- omitted 9 entries ]
```

to:

`diffPrint(lm(hp ~ disp, mtcars), lm(hp ~ cyl, mtcars))`

    @@ 1,8 @@                                   @@ 1,8 @@
      Call:                                       Call:
    < lm(formula = hp ~ disp, data = mtcars)    > lm(formula = hp ~ cyl, data = mtcars)
      Coefficients:                               Coefficients:
    < (Intercept) disp                          > (Intercept) cyl
    < 45.7345 0.4376                            > -51.05 31.96
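One way to picture why printed output is a useful basis for comparison is to capture what `print` produces for each object and compare the resulting lines. The sketch below does this with base R only. It illustrates the general idea and is not a description of how `diffobj` is implemented; `printed_a`, `printed_b`, `removed` and `added` are names made up here.

```r
# Capture the console representation of each data frame, one string per line
printed_a <- capture.output(print(iris))
printed_b <- capture.output(print(iris[-60, ]))

# Lines present in one printout but missing from the other
removed <- setdiff(printed_a, printed_b)
added   <- setdiff(printed_b, printed_a)

print(removed)   # the printed line for row 60
print(added)     # nothing: no line was introduced
```

A plain `setdiff` ignores line order and duplicates, so it is only a crude stand-in for a real diff, but it already isolates the dropped row because the `print` method lays out both data frames identically.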

In these examples I limited `all.equal` output to five lines for the sake of brevity. Also, since `testthat::compare` reverts to `all.equal` output with more complex objects I omit it from this comparison.
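The `reached getOption("max.print")` notes in the `all.equal` output above suggest that the truncation came from the `max.print` option. A minimal sketch of that approach, assuming this is how the limit was applied, looks like this:

```r
# Assumption: output was capped via the max.print option
old <- options(max.print = 5)      # remember the previous setting

res <- all.equal(iris, iris[-60, ])
print(res)                         # prints at most five entries, then a note

options(old)                       # restore the previous setting
```

Only the printing is truncated; `res` still holds every message that `all.equal` returned.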

Another candidate comparison function is `compare::compare`. I omitted it from this vignette because it focuses more on similarities than on differences. Additionally, `testthat::compare` and `compare::compare` `print` methods conflict so they cannot be used together.

For a more thorough exploration of `diffobj` methods and their features please see the primary `diffobj` vignette.