1)You know your initial state completely
2) Environment observability is only local(next rooom), but it is static
3) Actions are deterministic...
The problem boils down to this:
"given your actions can you determine the configuration of your environment, in this case a maze..."
Here is the python solution... Kinda had fun time solving it.. took about 1.5 hours to figure the problem and solve it... Have fun!