pip install gym_super_mario_bros nes_py
import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import gym_super_mario_bros
- Imports the game itself.
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
- By default, gym_super_mario_bros environments use the full NES action space of 256 discrete actions. To constrain this, gym_super_mario_bros.actions provides three action lists (RIGHT_ONLY, SIMPLE_MOVEMENT, and COMPLEX_MOVEMENT). We are using SIMPLE_MOVEMENT, which has only 7 actions and so reduces the data we have to process. JoypadSpace is the wrapper that applies the chosen list to the environment.
Actions are combinations of controls in the game.
(Bonus: simplify the environment as much as possible. The more complex it is, the harder it is going to be for our AI to learn how to play the game. We will also convert the RGB image into a grayscale image, which again reduces the data we need to process.)
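To see where the number 256 comes from, here is a minimal sketch in plain Python. The NES controller has 8 buttons, so there are 2^8 = 256 possible on/off combinations. The bit assigned to each button below is an illustrative assumption, not necessarily the layout nes_py uses internally, and the SIMPLE_MOVEMENT list is copied by hand so the snippet runs without the game installed.

```python
# The 8 buttons of an NES controller. The bit order is an illustrative
# assumption, not necessarily what nes_py uses internally.
BUTTONS = ['A', 'B', 'select', 'start', 'up', 'down', 'left', 'right']
BUTTON_BITS = {name: 1 << i for i, name in enumerate(BUTTONS)}

# Hand-copied from gym_super_mario_bros.actions.SIMPLE_MOVEMENT.
SIMPLE_MOVEMENT = [
    ['NOOP'],
    ['right'],
    ['right', 'A'],
    ['right', 'B'],
    ['right', 'A', 'B'],
    ['A'],
    ['left'],
]

def encode(combo):
    """Turn a list of button names into one integer bitmask (NOOP = 0)."""
    mask = 0
    for name in combo:
        if name != 'NOOP':
            mask |= BUTTON_BITS[name]
    return mask

full_action_space = 2 ** len(BUTTONS)   # every on/off combination
reduced = [encode(combo) for combo in SIMPLE_MOVEMENT]

print(full_action_space)   # 256
print(len(reduced))        # 7
```

So instead of choosing among 256 button combinations, the agent only has to choose among 7.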
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)
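done = True  # force a reset on the first iteration
for step in range(5000):
    if done:
        # start (or restart) the game
        state = env.reset()
    # take a random action and observe the result
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()
env.close()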
done = True
- Sets the flag to True. It tells whether the game needs to restart or not.
for step in range(5000):
- Runs the loop for 5000 steps. Every time the screen updates, we tell the game to perform a specific action.
env.reset()
- Restarts the game and returns the first state.
env.step()
- Passes the action to the game, like telling it to press a button to move right, left, etc.
env.action_space.sample()
- Gives a random action.
state, reward, done, info
- The data that env.step() returns for us to process.
env.render()
- Shows the game on the screen.
env.close()
- Closes the game.
OSError: exception: access violation reading 0x000000000003C200
If you get an access violation error like this one, just restart your kernel and run the imports again; that should do the job.
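If you want to check the reset/step/done logic without opening the emulator, the same control flow can be run against a tiny stand-in environment. ToyEnv and run_random_agent below are hypothetical helpers written for this sketch; they are not part of gym or gym_super_mario_bros, and the reward rule is made up.

```python
import random

class ToyEnv:
    """Hypothetical stand-in for the Mario environment (not a real API)."""
    def __init__(self, episode_length=10, n_actions=7):
        self.episode_length = episode_length
        self.n_actions = n_actions
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t                          # initial state

    def sample_action(self):
        return random.randrange(self.n_actions)

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0   # made-up rule: action 1 is rewarded
        done = self.t >= self.episode_length   # episode ends after a fixed length
        return self.t, reward, done, {}

def run_random_agent(env, n_steps):
    """Random-agent loop; returns (episodes started, total reward)."""
    episodes = 0
    total_reward = 0.0
    done = True                                # force a reset on the first step
    for _ in range(n_steps):
        if done:
            state = env.reset()
            episodes += 1
        state, reward, done, info = env.step(env.sample_action())
        total_reward += reward
    return episodes, total_reward

episodes, total_reward = run_random_agent(ToyEnv(episode_length=10), n_steps=35)
print(episodes)   # 4
```

With episodes of 10 steps, 35 steps cover three full episodes plus the start of a fourth, which is why the flag has to start as True.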
!pip3 install torch torchvision torchaudio
!pip install stable-baselines3[extra]
from gym.wrappers import GrayScaleObservation
from stable_baselines3.common.vec_env import VecFrameStack, DummyVecEnv
from matplotlib import pyplot as plt
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)
env = GrayScaleObservation(env, keep_dim=True)
env = DummyVecEnv([lambda: env])
env = VecFrameStack(env, 4, channels_order='last')
GrayScaleObservation(env, keep_dim=True)
- This command converts our environment's observations from RGB to grayscale. keep_dim=True keeps the final channel dimension, which is what frame stacking needs.
RGB image (240*256*3) = 184320 values to process
Grayscale image (240*256*1) = 61440 values to process
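The arithmetic above is easy to verify in a few lines. The to_gray helper is only an illustration of how three colour values can collapse into one; it uses a common luminance weighting, and the wrapper's exact conversion may differ.

```python
HEIGHT, WIDTH = 240, 256

rgb_values = HEIGHT * WIDTH * 3    # three colour channels per pixel
gray_values = HEIGHT * WIDTH * 1   # one intensity channel per pixel

def to_gray(r, g, b):
    """Illustrative luminance weighting (an assumption, not the wrapper's code)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(rgb_values)                  # 184320
print(gray_values)                 # 61440
print(rgb_values // gray_values)   # 3
```

Grayscale leaves us with one third of the values per frame.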
env = DummyVecEnv([lambda: env])
- We wrap the environment in a dummy vectorized environment. Each time we apply one of these wrappers, the shape of the data changes.
env = VecFrameStack(env, 4, channels_order='last')
- We pass our preprocessed environment and the number of frames to stack, in our case 4 (you can use more if you need to). Finally, channels_order='last' specifies the channel order, so the stack appears at the end of the shape.
code:
state = env.reset()
state.shape
output:
(1, 240, 256, 1) = one channel (before frame stacking)
(1, 240, 256, 4) = four channels (after frame stacking)
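Frame stacking is a sliding window over the most recent frames. Here is a rough sketch in plain Python using labels instead of images. reset_stack and stacked_shape are hypothetical helpers, and filling the older slots with blanks on reset is an assumption about how the stack starts, not something taken from the stable-baselines3 source.

```python
from collections import deque

N_STACK = 4

def reset_stack(first_frame, empty='blank'):
    """Start a new stack: older slots are blank, the newest frame is last."""
    return deque([empty] * (N_STACK - 1) + [first_frame], maxlen=N_STACK)

def stacked_shape(frame_shape, n_stack=N_STACK):
    """Shape after stacking along the last (channel) axis."""
    return frame_shape[:-1] + (frame_shape[-1] * n_stack,)

stack = reset_stack('frame0')
for t in range(1, 6):
    stack.append(f'frame{t}')   # the oldest frame falls off the left

print(list(stack))                       # ['frame2', 'frame3', 'frame4', 'frame5']
print(stacked_shape((1, 240, 256, 1)))   # (1, 240, 256, 4)
```

Seeing the last 4 frames at once is what lets the model pick up motion, such as which way Mario and the enemies are moving.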
state = env.reset()
state, reward, done, info = env.step([5])
[['NOOP'],
['right'],
['right', 'A'],
['right', 'B'],
['right', 'A', 'B'],
['A'],
['left']]
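The list above is SIMPLE_MOVEMENT, and the number passed to env.step() is an index into it. A small sketch of the lookup, with the list copied by hand so it runs on its own; the action is wrapped in a list because a vectorized environment expects one action per environment.

```python
# Hand-copied from gym_super_mario_bros.actions.SIMPLE_MOVEMENT.
SIMPLE_MOVEMENT = [
    ['NOOP'],
    ['right'],
    ['right', 'A'],
    ['right', 'B'],
    ['right', 'A', 'B'],
    ['A'],
    ['left'],
]

n_envs = 1               # DummyVecEnv wraps a single environment here
actions = [5] * n_envs   # one action index per environment

chosen = [SIMPLE_MOVEMENT[a] for a in actions]
print(chosen)            # [['A']]
```

So env.step([5]) presses only the A button, which makes Mario jump.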
state
- Our current state in the environment.
reward
- A positive score is given when our Mario performs well in the game, otherwise not. Our main goal is to maximize the total reward.
done
- Tells whether the episode has ended, that is, whether Mario is dead or the game is over.
info
- Some information about the environment, like:
{'coins': 0,
'flag_get': False,
'life': 2,
'score': 0,
'stage': 1,
'status': 'small',
'time': 400,
'world': 1,
'x_pos': 40,
'y_pos': 79}
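The info dictionary is handy for logging progress while the agent plays. A minimal sketch, using the values shown above; summarize is a hypothetical helper, not part of the library.

```python
# Values copied from the info dictionary shown above.
info = {'coins': 0, 'flag_get': False, 'life': 2, 'score': 0, 'stage': 1,
        'status': 'small', 'time': 400, 'world': 1, 'x_pos': 40, 'y_pos': 79}

def summarize(info, previous_x=None):
    """Hypothetical helper: build a short progress line from an info dict."""
    level = f"{info['world']}-{info['stage']}"
    moved = 0 if previous_x is None else info['x_pos'] - previous_x
    state = 'cleared' if info['flag_get'] else 'in progress'
    return (f"level {level} {state}, x={info['x_pos']} "
            f"(moved {moved}), time left {info['time']}")

print(summarize(info))   # level 1-1 in progress, x=40 (moved 0), time left 400
```

Passing the previous x_pos as previous_x shows how far Mario moved since the last step.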
plt.figure(figsize=(20,16))
# show each of the 4 stacked frames side by side
for idx in range(state.shape[3]):
    plt.subplot(1,4,idx+1)
    plt.imshow(state[0][:,:,idx])
plt.show()