Polysemy - Part III - Tests
This is part of a series on effect handling in Haskell using Polysemy
One of the benefits of writing pure code is that it’s so easy to test. You provide input, you get output, you assert on the output, that’s it. But “real world applications” have many functions with effects all over the place. And we also need to test those to ensure quality.
The problem is: how do you test effectful code? As the name indicates, naive tests of effectful code would themselves perform those effects, rendering them “hard” to both write and run.
A solution particularly favored in languages with no clear effect boundary is to use ephemeral “containerized” environments, like Docker containers, to run their PostgreSQL databases, Kafka clusters, etc., during tests. These ephemeral containers lower the pain of testing effectful code, but only to a point: the tests remain slow, and the setup is rather complex to write and maintain.
Another solution, when effectful code is well separated from business logic, is to mock the effects, i.e. to “replace” them with fake logic that suits each test exactly. These mocks usually have a greater LOC cost per test compared to containerized environments, but are extremely fast, and easier to maintain over time.
And guess what? Effect frameworks like Polysemy make it pretty simple to mock effects in tests.
Mocking effects
All we need to do to mock an effect is to change the interpreter layer! The effect declaration and the effect use in business code remain unchanged.
I will reuse the example from my previous post:
```haskell
import Polysemy

data Log m a where
  LogInfo :: String -> Log m ()

makeSem ''Log

myBusinessFunction :: Member Log r => Integer -> Integer -> Sem r Integer
myBusinessFunction m n = do
  logInfo $ "myBusinessFunction was called with parameters " <> show m <>
            " and " <> show n
  let result = m + n
  logInfo $ "myBusinessFunction result is " <> show result
  pure result

logToIO :: Member (Embed IO) r => Sem (Log ': r) a -> Sem r a
logToIO = interpret (\(LogInfo stringToLog) -> embed $ putStrLn stringToLog)
```
Step 1: No mocking
Let’s first write a test where the logging still happens in IO (logging to stdout):

```haskell
import Polysemy
import Test.HUnit

test_1and2is3 = TestCase $ do
  result <- runM . logToIO $ myBusinessFunction 1 2
  result @?= 3
```
The test passes but stdout was polluted with the logs:

```
Cases: 1 Tried: 0 Errors: 0 Failures: 0myBusinessFunction was called with parameters 1 and 2
myBusinessFunction result is 3
Cases: 1 Tried: 1 Errors: 0 Failures: 0
```
Imagine hundreds of tests running: your terminal (or your CI logs) will quickly get cluttered. Even worse: if our logging interpreter logged to a file, each test run would create and write into such a log file! This would also not scale well with other effects, like database calls.
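For instance, a file-logging interpreter would look something like the sketch below (logToFile is hypothetical, not part of this series); every single test run through it would touch the filesystem:

```haskell
import Polysemy

-- Hypothetical file-based interpreter (not part of the original series):
-- every interpreted LogInfo appends a line to a file, so each test run
-- would create and grow that log file.
logToFile :: Member (Embed IO) r => FilePath -> Sem (Log ': r) a -> Sem r a
logToFile path =
  interpret (\(LogInfo stringToLog) -> embed $ appendFile path (stringToLog <> "\n"))
```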
We can do better.
Step 2: Disable logs
Our goal is to write tests for myBusinessFunction without actually logging to stdout.
Let’s write another interpreter, logToSilence, for the Log effect, except this interpreter will simply ignore the log and do nothing:

```haskell
logToSilence :: Sem (Log ': r) a -> Sem r a
logToSilence = interpret (\(LogInfo _) -> pure ())

test_1and2is3 = TestCase $ do
  result <- runM . logToSilence $ myBusinessFunction 1 2
  result @?= 3
```
Now the logs are clean, the effect was nicely interpreted:

```
Cases: 1 Tried: 1 Errors: 0 Failures: 0
```
Note, we previously had to run in IO because logToIO required it. Our silencing interpreter is pure, though, so we can replace Polysemy’s runM with run and work with pure code:

```haskell
test_1and2is3 = TestCase $
  let result = run . logToSilence $ myBusinessFunction 1 2
  in  result @?= 3
```
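For reference, here are the shapes of the two Polysemy runners we are switching between (copied here as a reminder):

```haskell
-- Reminder of the two Polysemy runners:
-- run  :: Sem '[] a -> a                      -- every effect already interpreted
-- runM :: Monad m => Sem '[Embed m] a -> m a  -- one embedded monad (e.g. IO) left
```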
Step 3: Test effects too
Rather than silencing those logs, maybe logging is part of our requirements. In such a case, we should actually check that the function logs correctly!
Let’s replace our silencing interpreter with another one that records all logs, so that we can check exactly what was logged. We will rely on another pre-existing Polysemy effect, namely Polysemy.Writer, which is the Polysemy equivalent of Writer or WriterT.
Long story short, a Writer a effect allows you to write values of type a, which will be glued together using mappend (thus the Monoid constraint). Note that you can’t read those values as long as you are in code under this effect. Then, when interpreting this effect, the result will be a pair of the resulting written a and the value returned by the effectful code. For our needs, a ~ [String], i.e. we record each log and we will get the list of all logged lines when interpreting this Writer [String] effect:
```haskell
import Polysemy
import Polysemy.Writer
import Test.HUnit

logToRecord :: Member (Writer [String]) r => Sem (Log ': r) a -> Sem r a
logToRecord = interpret (\(LogInfo stringToLog) -> tell [stringToLog])

test_1and2is3 = TestCase $
  let (logs, result) = run . runWriter . logToRecord $ myBusinessFunction 1 2
  in do
    result @?= 3
    logs @?= [ "myBusinessFunction was called with parameters 1 and 2"
             , "myBusinessFunction result is 3" ]
```
So what’s going on here?
- logToRecord interprets the Log effect in terms of Writer [String], i.e. we record all the logged lines as a list of strings (using tell from Polysemy.Writer to add logs)
- We run this Writer effect using runWriter (this will aggregate all recorded logs thanks to the Monoid constraint, using list appending)
- We can now assert both on the business result 3 and on the logged lines
That’s it! We have successfully removed the IO effect, our tests are pure, yet we can fully assert on both the business results and the effects!
Note: Technically we should use runWriterAssocR instead of runWriter since the monoid is a list, for performance reasons, but this is beyond the scope of this post.
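If you want to make that swap, the change is mechanical; something like the sketch below (the primed test name is just for illustration):

```haskell
-- Same test, using the right-associated Writer interpreter; the behaviour
-- is identical, list appends are just reassociated for performance.
test_1and2is3' = TestCase $
  let (logs, result) = run . runWriterAssocR . logToRecord $ myBusinessFunction 1 2
  in do
    result @?= 3
    logs @?= [ "myBusinessFunction was called with parameters 1 and 2"
             , "myBusinessFunction result is 3" ]
```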
It is exactly the same for property based tests (PBT). Say we want to (quick)check our business function is associative (on the result) and check that the last log will be the same (others will not because of order of application):
```haskell
import Polysemy
import Polysemy.Writer
import Test.QuickCheck

logToRecord :: Member (Writer [String]) r => Sem (Log ': r) a -> Sem r a
logToRecord = interpret (\(LogInfo stringToLog) -> tell [stringToLog])

test_associative = \a b c ->
  let
    (logsAB_then_C, resultAB_then_C) = run . runWriter . logToRecord $ do
      resultAB <- myBusinessFunction a b
      myBusinessFunction resultAB c
    (logsA_then_BC, resultA_then_BC) = run . runWriter . logToRecord $ do
      resultBC <- myBusinessFunction b c
      myBusinessFunction a resultBC
  in
    last logsAB_then_C == last logsA_then_BC && resultAB_then_C == resultA_then_BC
```
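If you want to actually run this property, a possible driver is sketched below (the main wrapper is not part of the original example):

```haskell
-- Run the associativity property on random inputs (QuickCheck defaults to 100).
main :: IO ()
main = quickCheck test_associative
```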
You can find the full code example on my Github repo.
This first example showed how to change the interpreter to mock/record the effect behavior, but a major part of mocking effects is to return a dummy value instead of executing the effect to retrieve the value (e.g. database access or environment variable).
I think it’s interesting to showcase another example where the effect action has a return value other than ().
Intermediary example: environment variables
Let’s consider the use case of environment variable access.
Many applications need to read some global configuration, often passed by Kubernetes/Rancher through environment variables or secret files. A database URL, the logging level, a port number, an API key, you name it.
The effect declaration
```haskell
import Polysemy

data Configuration m a where
  ReadConf :: String -> Configuration m (Maybe String)

makeSem ''Configuration
```
No surprise here. When you readConf, you have to pass the name of the parameter to read, and you get back a Maybe String (Nothing if the parameter is not configured).
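To give a feel for the generated smart constructor, here is a small hypothetical helper built on top of readConf (readLogLevel and the LOG_LEVEL parameter are illustrative, not part of the post):

```haskell
import Data.Maybe (fromMaybe)
import Polysemy

-- readConf is the smart constructor generated by makeSem; it returns a
-- Maybe String inside Sem, so we can default to "INFO" when the parameter
-- is not configured.
readLogLevel :: Member Configuration r => Sem r String
readLogLevel = fromMaybe "INFO" <$> readConf "LOG_LEVEL"
```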
The effect use in business code
```haskell
import Data.Maybe
import Polysemy

myBusinessFunction :: Member Configuration r => Int -> Sem r (Either String Int)
myBusinessFunction amount = do
  maybeMinimumAmount <- fmap (fmap read) (readConf "MINIMUM_AMOUNT")
  let minimumAmount = fromMaybe 500 maybeMinimumAmount
  pure $ if amount >= minimumAmount
    then Right amount
    else Left $ show amount ++ " is lower than the minimum allowed amount " ++ show minimumAmount
```
This function reads the MINIMUM_AMOUNT configuration setting, or uses 500 as default value, then checks that the passed value is greater than or equal to the minimum amount.
Note: the double fmap may look weird; this is because we want to convert the resulting String to an Int, but there are 2 layers to map over: the Sem effect monad and Maybe.
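To make those two layers explicit, the same read can be written as a standalone helper (readMinimumAmount is illustrative, not part of the original code):

```haskell
import Polysemy

-- The outer fmap maps over the Sem r layer, the inner fmap maps over Maybe.
readMinimumAmount :: Member Configuration r => Sem r (Maybe Int)
readMinimumAmount = fmap (fmap read) (readConf "MINIMUM_AMOUNT")
```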
Again, the business code is not concerned with how the configuration is retrieved. Is it from an environment variable? A file? A cache? A hardcoded value? Or a combination of those? This decision is up to the interpreter!
The interpreters
This is an example of an interpreter that reads from environment variables:
```haskell
import System.Environment
import Polysemy

confToIO :: Member (Embed IO) r => Sem (Configuration ': r) a -> Sem r a
confToIO = interpret (\(ReadConf envVarName) -> embed $ lookupEnv envVarName)
```
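For contrast with the mock coming next, wiring this interpreter into a real program might look like the following sketch (checkAmountInIO is a hypothetical name, not from the post):

```haskell
import Polysemy

-- Interpret Configuration via environment variables, then discharge the
-- remaining Embed IO effect with runM.
checkAmountInIO :: Int -> IO (Either String Int)
checkAmountInIO amount = runM . confToIO $ myBusinessFunction amount
```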
Now let’s write a mock interpreter for our tests!
As stated in the introduction, mocking means that each test gets to decide the behavior of effects. In this particular case, it means the decision of how to transform the configuration name (e.g. MINIMUM_AMOUNT) into a value (of type Maybe String) is up to each test, not to the interpreter.
Said differently, the interpreter should take as an argument how to do this transformation.
In a functional language, this means: the interpreter should take a function String -> Maybe String as an argument, and each test should pass such a function (the mock behavior).
```haskell
confToMock :: (String -> Maybe String) -> Sem (Configuration ': r) a -> Sem r a
confToMock mockLookupEnv = interpret (\(ReadConf envVarName) -> pure $ mockLookupEnv envVarName)
```
And now a couple of unit tests showing how to use it:
```haskell
import Test.HUnit

test_defaultMinimumAmount_lower = TestCase $
  let
    mockLookupEnv _ = Nothing
    result = run . confToMock mockLookupEnv $ myBusinessFunction 400
  in
    result @?= Left "400 is lower than the minimum allowed amount 500"

test_minimumAmount_greater = TestCase $
  let
    mockLookupEnv "MINIMUM_AMOUNT" = Just "250"
    mockLookupEnv _ = Nothing
    result = run . confToMock mockLookupEnv $ myBusinessFunction 400
  in
    result @?= Right 400
```
As you can see, each test provides the mocking function mockLookupEnv, which is then injected into the interpreter.
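One possible way to group and run both tests (the main wrapper and TestList are not part of the original example):

```haskell
-- Collect the two HUnit tests above and run them, printing a summary.
main :: IO Counts
main = runTestTT $ TestList [test_defaultMinimumAmount_lower, test_minimumAmount_greater]
```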
You can find the full code example on my Github repo.
Conclusion
As shown in this post, testing (by mocking) is nice and inexpensive in Haskell with an effect framework like Polysemy.
The separation between effect declaration, effect use and effect interpretation is a big plus: tests only need to change the interpreter to mock effects, all other things remaining equal.
We have been using this technique for more than 6 months in my team, and we truly enjoy it. Most of us have experience with mocking techniques and frameworks in other languages (e.g. Java, JavaScript), but testing in Haskell with Polysemy is more enjoyable by a long shot.
Remember: all the things shown here are loosely coupled to the testing libraries (HUnit and QuickCheck) and the effect library (Polysemy). You can achieve similar benefits with any testing or effect library that relies on the same separation of effect declaration, effect use and effect interpretation.
Enjoy testing!