Discussion:
Remove Semigroup and Monoid instances for Either
Andrew Martin
2018-05-24 13:56:48 UTC
Permalink
The Semigroup and Monoid instances for Either have surprising behaviors.
Currently, the Semigroup instance is written as:

instance Semigroup (Either a b) where
Left _ <> b = b
a <> _ = a

One would expect that the Semigroup instances for `Maybe a` and `Either ()
a` would have the same behaviors, but they don't. We could do better with:

instance Semigroup b => Semigroup (Either a b) where
(<>) = liftA2 (<>)

And now the behaviors match. This generalizes the old Semigroup instance,
which could be recovered with `Either a (Data.Semigroup.First b)`.

For now, all I'm proposing is removing the existing Semigroup and Monoid
instances for Either. I do not believe that new instances should be given
immidiately, since that would be the worst kind of breaking change: one
that the compiler does not error on.

-Andrew Thaddeus Martin
Herbert Valerio Riedel
2018-05-24 14:28:56 UTC
Permalink
Andrew,
Post by Andrew Martin
One would expect that the Semigroup instances for `Maybe a` and `Either ()
a` would have the same behaviors
Given this expectation is used as motivation for the proposal, could you
expand the rationale from which you derive this expectation? Are there
data-points that give reason to believe that the current instance are
actually problematic, as they result in code being written that behaves
in an unexpected way that isn't detected right-away by programmers?
Post by Andrew Martin
For now, all I'm proposing is removing the existing Semigroup and Monoid
instances for Either. I do not believe that new instances should be given
immidiately, since that would be the worst kind of breaking change: one
that the compiler does not error on.
If the intent is to re-add different instance at a later time, then I
strongly oppose this IMHO wrong reasoning -- either get rid of the
instances forever, or switch over gap-less

We had a very similar discussion a couple months ago; I'll just refer
to the argument from then which applies here as well with minor
modifications, see below


-------------------- Start of forwarded message --------------------
Subject: Re: Proposal: Remove Semigroup and Monoid instances for
Data.Map, Data.IntMap, Data.HashMap
From: Oleg Grenrus <***@iki.fi>
Message-ID: <ab7a0a16-1c64-1a9d-0733-***@iki.fi>
Date: Wed, 14 Feb 2018 17:48:36 +0200

I don't like a "no-instances" window. We can change the instances to correct
ones directly. That's what semantic versioning is about, to be able to
communicate semantic changes.

We should go directly to

   Semigroup v => Semigroup (Map k v)
   Semigroup v => Monoid (Map k v)

If `containers-0.6` or `containers-1.0` included some other (compile-time)
breaking change: even better.

I want to point out that a lot of code *will* break at compile-time with the
change.  In polymorphic setting `v` doesn't have `Semigroup v`
attached.  Many
concrete types arent `Semigroup` (e.g. `Int`; here and later when I say
isn't
`Semigroup`, I mean that it isn't `Monoid` either).

I tried to compile our `futurice-prelude` [1] and one small size internal
app (40 modules) with patched containers: things break at compile time

This change *won't get thru unnoticed*. See below for details.

Also people (we and others) use `mempty` for `Map.empty`, but for `mempty`
semantics won't change. We might need to use `Map.empty` if `v` isn't
`Semigroup`,
but that's catched by compiler. Note: that would mean to use `Map.empty`
everywhere
if there is no-instances window.

I also conjecture, based on the results, that `Ord k => Monoid (Map k
v)` for
`union` is currently avoided in libs. Maybe because it does *the wrong
thing*,
so people don't use it. In our own 50kLOC codebase we have three
`Map.unionWith` or `Map.unionsWith`, one of which is

    groupThings = Map.unionsWith (<>) . map f).

I think `Map k String` or `Map k Text` things might be common in private
codebases (not ours). Maybe not the most popular opinion, but I think they
deserve to break.


Something else: If we replace the instance, we have to leave it out of
`containers` for older than `base-4.9`.
- Either we provide `Semigroup` (and `Monoid` !) in `semigroups` (See +) or
- just have `base >= 4.9` in that release of `containers`
  (that will be three GHCs soon).


To conclude, The brekage on Hackage would be driven by things not having
`Semigroup` instances, not semantics. There might be subtle breakages,
but not
something we as community cannot handle. So let's just make a change.

Best regards, Oleg

+ At this point we want `Semigroup v` for `Monoid (Map k v)`, and because
  `semigroups` depend on `containers` (and not other way), we'd need to have
  `Monoid (Map k v)` also in `semigroups`.

---

I did an experiment: compile our `futurice-prelude` [1] with amended
`containers`, I changed only instances for `Map`.


Removal of the instances breaks Cabal, so I could test far enough what
happens:

    Distribution/Simple/Program/GHC.hs:546:12: error:
        • No instance for (Semigroup (Map Extension String))
            arising from a use of ‘gmempty’
        • In the expression: gmempty
          In an equation for ‘mempty’: mempty = gmempty
          In the instance declaration for ‘Monoid GhcOptions’
        |
    546 |   mempty = gmempty
        |            ^^^^^^^

This is one place where we want `union` semantics.
OTOH in current development version of Cabal the map is of type

    Map Extension (Maybe Compiler.Flag),

as we moving avoid from stringly-typed programming. And `Flag` doesn't have
`Semigroup` instance.


Second step: Let's see where Monoid (Map k v) is used. We cannot
deprecate the
instance, but we can se where it's used by a -fdefer-type-error trick
(thanks Bartosz Nitka aka niteria [2])
Note that this step actually has changes from third step already, so
we don't see failures where `Semigroup v` context would have caused
compilation
failure.

To my happy surprise the instance were used only once in Cabal-2.0.1.1
(above),
and then in `futurice-prelude` itself where it would also break with
`Semigroup v`.
So 231 packages at the bottom of dependency graph of lens+servant apps
don't use `Monoid (Map k v)`. Interesting fact, I'd say.


Third step: If I change instance to use `unionWith (<>)`, then the first
failure in our production code is `reducers-3.12.2`

    src/Data/Semigroup/Reducer.hs:233:10: error:
        • Could not deduce (Semigroup v)
            arising from the superclasses of an instance declaration
          from the context: Ord k
            bound by the instance declaration
            at src/Data/Semigroup/Reducer.hs:233:10-42
          Possible fix:
            add (Semigroup v) to the context of the instance declaration
        • In the instance declaration for ‘Reducer (k, v) (Map k v)’
        |
    233 | instance Ord k => Reducer (k, v) (Map k v) where

and `servant-0.13`

    src/Servant/Server/Internal.hs:691:34: error:
        • No instance for (Data.Semigroup.Semigroup
                             (Router' env RoutingApplication))
            arising from a use of ‘mempty’
        • In the first argument of ‘StaticRouter’, namely ‘mempty’
          In the expression: StaticRouter mempty mempty
          In an equation for ‘route’:
              route Proxy _ _ = StaticRouter mempty mempty
        |
    691 |   route Proxy _ _ = StaticRouter mempty mempty

and `swagger2-2.2`.  `SwaggerMonoid` differes for `Maybe`, but has default
implementation in terms of `Monoid`.

    src/Data/Swagger/Internal/Utils.hs:116:10: error:
        • Could not deduce (Data.Semigroup.Semigroup v)
            arising from a use of
‘Data.Swagger.Internal.Utils.$dmswaggerMappend’
          from the context: Ord k
            bound by the instance declaration
            at src/Data/Swagger/Internal/Utils.hs:116:10-41
          Possible fix:
            add (Data.Semigroup.Semigroup v) to the context of
              the instance declaration
        • In the expression:
            Data.Swagger.Internal.Utils.$dmswaggerMappend @Map k v
          In an equation for ‘swaggerMappend’:
              swaggerMappend
                = Data.Swagger.Internal.Utils.$dmswaggerMappend @Map k v
          In the instance declaration for ‘SwaggerMonoid (Map k v)’
        |
    116 | instance Ord k => SwaggerMonoid (Map k v)

And then we get to `futurice-prelude` which also fails:

    src/Futurice/Prelude.hs:177:41: error:
        • Could not deduce (Semigroup v) arising from a use of ‘ifolded’

Funny thing about that code, is that there we work with `Map k (Map l
v)` and
fold them.  For other one we want `UnionWith` and for other any semantic
should
be ok  (we have tests, they *will* break if I get things wrong).

We also have a `Map` wrapper which also breaks loudly.


Fourth step: fix above and contunue investigation where `Monoid (Map k
v)` is
used.  I built one internal application which pulls in even more
dependencies
in addition to what `futurice-prelude` already depends upon (465 deps, e.g.
`amazonka`).

- Two of our internal libraries use `mempty` (second one a lot). Not
breaking.

- One package uses a `Map` wrapper used above (for the sake of experiment I
  passed the `Warning` constraint down)

- The app itself uses `mempty` of `folded` on things which aren't
`Semigroup`.

- `yaml-0.8.28` breaks:

    Data/Yaml/Parser.hs:154:13: error:
        • Could not deduce (Data.Map.Internal.Warning YamlValue)
            arising from a use of ‘await’

  `YamlValue` doesn't have a `Semigroup` (or `Monoid`) instance.

- `amazonka-core-1.5.0`: `mempty`: non silent breakage.

    src/Network/AWS/Data/XML.hs:252:49: error:
        • No instance for (containers-0.5.11.0:Data.Map.Internal.Warning
                             Text)
            arising from a use of ‘mempty’
        • In the second argument of ‘Element’, namely ‘mempty’
          In the first argument of ‘NodeElement’, namely
            ‘(Element (name k) mempty [NodeContent v])’
          In the expression:
            NodeElement (Element (name k) mempty [NodeContent v])
        |
    252 |     node (k, v) = NodeElement (Element (name k) mempty
[NodeContent v])


[1]: https://github.com/futurice/haskell-futurice-prelude
[2]: https://ghc.haskell.org/trac/ghc/ticket/12014#comment:6
Post by Andrew Martin
David,
Alright, then let's do a little Gedankenexperiment; what if you release
a containers-0.9.0.0 (which drops the instances) and a
containers-1.0.0.0 (which 'adds back' the desired new instances) on the
same day!
...wouldn't this allow us to have the cake and eat it too? ;-)
It would let us have some cake. Users would be able to test against
0.9, in theory. But they'd have to do it intentionally.
This is what I was trying to get at: that's always the case. It doesn't
matter whether you release it same day, a week later, or a year later. I
have to intentionally *not* skip the 0.9 release. In fact, I'd want to
skip the 0.9 release myself if possible, as supporting both,
containers-0.9 and containers-1.0 in the same consumer code is going to
be awkward enough for me not wanting to do that (or I'd have to
basically not use the instances at all); If I care about the new
instances, I'd rather specify `containers ^>= 1.0.0.0` and thus not
support containers-0.9.
I would have proceeded to point out in my Gedankenexperiment, that in
order to use such a container-0.9 release, you'd have to force every
package that depends on `containers` to go through such a 0.9 phase in
lockstep, which in itself poses an ordeal in terms of package update
logistic; and then if you spread the time between the 0.9 and the 1.0
release apart, repeat this dance a 2nd time, with yet another new major
version bump, which requires yet another round of consumer-code reviews
(*because* a major version increment was signalled; and we do that as
API consumers are supposed to pay attention to the major version
increment signal...)
So consequently, if anything, a container-0.9 release would be a kind of
an artificial pseudo-release anyway IMO. You could even just condense it
into a cabal package flag of a containers-1.0 release, which disables
the Monoid/Semigroup instances; then you don't even need a distinct
container-0.9 release at all, nor do you have to contaminate the API
progression with an artificial container-0.9 discontinuity.
And Stack-based projects would probably need some shenanigans to deal
with a version of containers that doesn't come with GHC.
I'm not convinced we should let the weakest link hold us back. If
there's limitations in the tooling, we should rather address them, rather
than contort ourselves; c.f. the Genius Tailor
( http://www.wealthculture.cn/infoShow.asp?nid=880&sort=Article%20List )
So I really think that avoiding these subtle bugs requires at least a
full GHC release cycle.
I don't think waiting a full GHC release would be either necessary nor
effective at addressing your concerns, which I consider to be disputable
arguments to begin with.
-- hvr
-------------------- End of forwarded message --------------------
Andrew Martin
2018-05-24 14:48:40 UTC
Permalink
There's a project at work that a colleague and I worked on where we both
assumed that the Monoid instance for Either would behave like the one I
proposed, and then it didn't. There's also someone on reddit (
https://www.reddit.com/r/haskell/comments/8lbzan/semigroup_maybe_too_strict/dzhwzjs/)
There is a bug somewhere anyway as instances for Either () a and Maybe a
really should match, but I'm no longer sure that the bug is in the Maybe
instance rather than the Either () a one.

Those are my two data points. It's small, but I wanted to gauge what others
thought.

I'm familiar with the discussion around the Semigroup/Monoid instance for
Map. There was no general consensus on whether or not there should be a
gap, and I fall pretty strongly on the other side of the fence for reasons
that are listed elsewhere in that thread. In the case of the
Semigroup/Monoid instances for Either, I don't really care as much since
the current instances have such little utility that I've never even seen
anyone use them in the wild.
Andrew,
Post by Andrew Martin
One would expect that the Semigroup instances for `Maybe a` and `Either
()
Post by Andrew Martin
a` would have the same behaviors
Given this expectation is used as motivation for the proposal, could you
expand the rationale from which you derive this expectation? Are there
data-points that give reason to believe that the current instance are
actually problematic, as they result in code being written that behaves
in an unexpected way that isn't detected right-away by programmers?
Post by Andrew Martin
For now, all I'm proposing is removing the existing Semigroup and Monoid
instances for Either. I do not believe that new instances should be given
immidiately, since that would be the worst kind of breaking change: one
that the compiler does not error on.
If the intent is to re-add different instance at a later time, then I
strongly oppose this IMHO wrong reasoning -- either get rid of the
instances forever, or switch over gap-less
We had a very similar discussion a couple months ago; I'll just refer
to the argument from then which applies here as well with minor
modifications, see below
-------------------- Start of forwarded message --------------------
Subject: Re: Proposal: Remove Semigroup and Monoid instances for
Data.Map, Data.IntMap, Data.HashMap
Date: Wed, 14 Feb 2018 17:48:36 +0200
I don't like a "no-instances" window. We can change the instances to correct
ones directly. That's what semantic versioning is about, to be able to
communicate semantic changes.
We should go directly to
Semigroup v => Semigroup (Map k v)
Semigroup v => Monoid (Map k v)
If `containers-0.6` or `containers-1.0` included some other (compile-time)
breaking change: even better.
I want to point out that a lot of code *will* break at compile-time with the
change. In polymorphic setting `v` doesn't have `Semigroup v`
attached. Many
concrete types arent `Semigroup` (e.g. `Int`; here and later when I say
isn't
`Semigroup`, I mean that it isn't `Monoid` either).
I tried to compile our `futurice-prelude` [1] and one small size internal
app (40 modules) with patched containers: things break at compile time
This change *won't get thru unnoticed*. See below for details.
Also people (we and others) use `mempty` for `Map.empty`, but for `mempty`
semantics won't change. We might need to use `Map.empty` if `v` isn't
`Semigroup`,
but that's catched by compiler. Note: that would mean to use `Map.empty`
everywhere
if there is no-instances window.
I also conjecture, based on the results, that `Ord k => Monoid (Map k
v)` for
`union` is currently avoided in libs. Maybe because it does *the wrong
thing*,
so people don't use it. In our own 50kLOC codebase we have three
`Map.unionWith` or `Map.unionsWith`, one of which is
groupThings = Map.unionsWith (<>) . map f).
I think `Map k String` or `Map k Text` things might be common in private
codebases (not ours). Maybe not the most popular opinion, but I think they
deserve to break.
Something else: If we replace the instance, we have to leave it out of
`containers` for older than `base-4.9`.
- Either we provide `Semigroup` (and `Monoid` !) in `semigroups` (See +) or
- just have `base >= 4.9` in that release of `containers`
(that will be three GHCs soon).
To conclude, The brekage on Hackage would be driven by things not having
`Semigroup` instances, not semantics. There might be subtle breakages,
but not
something we as community cannot handle. So let's just make a change.
Best regards, Oleg
+ At this point we want `Semigroup v` for `Monoid (Map k v)`, and because
`semigroups` depend on `containers` (and not other way), we'd need to
have
`Monoid (Map k v)` also in `semigroups`.
---
I did an experiment: compile our `futurice-prelude` [1] with amended
`containers`, I changed only instances for `Map`.
Removal of the instances breaks Cabal, so I could test far enough what
• No instance for (Semigroup (Map Extension String))
arising from a use of ‘gmempty’
• In the expression: gmempty
In an equation for ‘mempty’: mempty = gmempty
In the instance declaration for ‘Monoid GhcOptions’
|
546 | mempty = gmempty
| ^^^^^^^
This is one place where we want `union` semantics.
OTOH in current development version of Cabal the map is of type
Map Extension (Maybe Compiler.Flag),
as we moving avoid from stringly-typed programming. And `Flag` doesn't have
`Semigroup` instance.
Second step: Let's see where Monoid (Map k v) is used. We cannot
deprecate the
instance, but we can se where it's used by a -fdefer-type-error trick
(thanks Bartosz Nitka aka niteria [2])
Note that this step actually has changes from third step already, so
we don't see failures where `Semigroup v` context would have caused
compilation
failure.
To my happy surprise the instance were used only once in Cabal-2.0.1.1
(above),
and then in `futurice-prelude` itself where it would also break with
`Semigroup v`.
So 231 packages at the bottom of dependency graph of lens+servant apps
don't use `Monoid (Map k v)`. Interesting fact, I'd say.
Third step: If I change instance to use `unionWith (<>)`, then the first
failure in our production code is `reducers-3.12.2`
• Could not deduce (Semigroup v)
arising from the superclasses of an instance declaration
from the context: Ord k
bound by the instance declaration
at src/Data/Semigroup/Reducer.hs:233:10-42
add (Semigroup v) to the context of the instance declaration
• In the instance declaration for ‘Reducer (k, v) (Map k v)’
|
233 | instance Ord k => Reducer (k, v) (Map k v) where
and `servant-0.13`
• No instance for (Data.Semigroup.Semigroup
(Router' env RoutingApplication))
arising from a use of ‘mempty’
• In the first argument of ‘StaticRouter’, namely ‘mempty’
In the expression: StaticRouter mempty mempty
route Proxy _ _ = StaticRouter mempty mempty
|
691 | route Proxy _ _ = StaticRouter mempty mempty
and `swagger2-2.2`. `SwaggerMonoid` differes for `Maybe`, but has default
implementation in terms of `Monoid`.
• Could not deduce (Data.Semigroup.Semigroup v)
arising from a use of
‘Data.Swagger.Internal.Utils.$dmswaggerMappend’
from the context: Ord k
bound by the instance declaration
at src/Data/Swagger/Internal/Utils.hs:116:10-41
add (Data.Semigroup.Semigroup v) to the context of
the instance declaration
swaggerMappend
In the instance declaration for ‘SwaggerMonoid (Map k v)’
|
116 | instance Ord k => SwaggerMonoid (Map k v)
• Could not deduce (Semigroup v) arising from a use of ‘ifolded’
Funny thing about that code, is that there we work with `Map k (Map l
v)` and
fold them. For other one we want `UnionWith` and for other any semantic
should
be ok (we have tests, they *will* break if I get things wrong).
We also have a `Map` wrapper which also breaks loudly.
Fourth step: fix above and contunue investigation where `Monoid (Map k
v)` is
used. I built one internal application which pulls in even more
dependencies
in addition to what `futurice-prelude` already depends upon (465 deps, e.g.
`amazonka`).
- Two of our internal libraries use `mempty` (second one a lot). Not
breaking.
- One package uses a `Map` wrapper used above (for the sake of experiment I
passed the `Warning` constraint down)
- The app itself uses `mempty` of `folded` on things which aren't
`Semigroup`.
• Could not deduce (Data.Map.Internal.Warning YamlValue)
arising from a use of ‘await’
`YamlValue` doesn't have a `Semigroup` (or `Monoid`) instance.
- `amazonka-core-1.5.0`: `mempty`: non silent breakage.
• No instance for (containers-0.5.11.0:Data.Map.Internal.Warning
Text)
arising from a use of ‘mempty’
• In the second argument of ‘Element’, namely ‘mempty’
In the first argument of ‘NodeElement’, namely
‘(Element (name k) mempty [NodeContent v])’
NodeElement (Element (name k) mempty [NodeContent v])
|
252 | node (k, v) = NodeElement (Element (name k) mempty
[NodeContent v])
[1]: https://github.com/futurice/haskell-futurice-prelude
[2]: https://ghc.haskell.org/trac/ghc/ticket/12014#comment:6
Post by Andrew Martin
David,
Alright, then let's do a little Gedankenexperiment; what if you release
a containers-0.9.0.0 (which drops the instances) and a
containers-1.0.0.0 (which 'adds back' the desired new instances) on the
same day!
...wouldn't this allow us to have the cake and eat it too? ;-)
It would let us have some cake. Users would be able to test against
0.9, in theory. But they'd have to do it intentionally.
This is what I was trying to get at: that's always the case. It doesn't
matter whether you release it same day, a week later, or a year later. I
have to intentionally *not* skip the 0.9 release. In fact, I'd want to
skip the 0.9 release myself if possible, as supporting both,
containers-0.9 and containers-1.0 in the same consumer code is going to
be awkward enough for me not wanting to do that (or I'd have to
basically not use the instances at all); If I care about the new
instances, I'd rather specify `containers ^>= 1.0.0.0` and thus not
support containers-0.9.
I would have proceeded to point out in my Gedankenexperiment, that in
order to use such a container-0.9 release, you'd have to force every
package that depends on `containers` to go through such a 0.9 phase in
lockstep, which in itself poses an ordeal in terms of package update
logistic; and then if you spread the time between the 0.9 and the 1.0
release apart, repeat this dance a 2nd time, with yet another new major
version bump, which requires yet another round of consumer-code reviews
(*because* a major version increment was signalled; and we do that as
API consumers are supposed to pay attention to the major version
increment signal...)
So consequently, if anything, a container-0.9 release would be a kind of
an artificial pseudo-release anyway IMO. You could even just condense it
into a cabal package flag of a containers-1.0 release, which disables
the Monoid/Semigroup instances; then you don't even need a distinct
container-0.9 release at all, nor do you have to contaminate the API
progression with an artificial container-0.9 discontinuity.
And Stack-based projects would probably need some shenanigans to deal
with a version of containers that doesn't come with GHC.
I'm not convinced we should let the weakest link hold us back. If
there's limitations in the tooling, we should rather address them, rather
than contort ourselves; c.f. the Genius Tailor
( http://www.wealthculture.cn/infoShow.asp?nid=880&sort=Article%20List )
So I really think that avoiding these subtle bugs requires at least a
full GHC release cycle.
I don't think waiting a full GHC release would be either necessary nor
effective at addressing your concerns, which I consider to be disputable
arguments to begin with.
-- hvr
-------------------- End of forwarded message --------------------
--
-Andrew Thaddeus Martin
Loading...