Continuing with DICOM
Further understanding DICOM and using fastai to work with DICOM
- Importing
- Creating a metadata dataframe
- Some DICOM gotchas to be aware of
- DON'T see like a radiologist!
- Cleaning the data for rapid prototyping
- Building a classifier
In this blog, we will largely use Jeremy Howard's series of notebooks on Kaggle to understand DICOM and to learn how to use fastai.medical to work with DICOM files. I would highly encourage going through Jeremy's blogs. Here, I will merely be blogging as I learn and practice his notebooks.
There are five notebooks in this series. In this blog, we will go through the first four notebooks.
- Creating a metadata DataFrame - In this notebook, we will learn how to create a DataFrame from DICOM metadata
- Some DICOM gotchas to be aware of - Here, we will learn some things to watch out for when working with DICOM, like signed data being used as if it were unsigned
- Don't see like a radiologist - Radiologists, as we will learn below, use windowing to observe different areas of interest such as the brain or the subdural space, because humans are limited in their ability to distinguish different contrast levels. This human limitation does not affect machines. In this notebook, we will learn how to prepare (normalize) our images to help our model learn better.
- Cleaning the data for rapid prototyping - We will put our learnings into action by preparing a prototyping dataset. We will clean the data and then make a smaller set of smaller JPEG images for prototyping.
- From prototyping to submission - In this final notebook, we will learn to use our prototype dataset to build a classifier, then train the classifier on the full-scale dataset (not covered in this blog).
from fastai.basics import *
from fastai.vision.all import *
from fastai.medical.imaging import *
import seaborn as sns
import scipy.stats  # needed later to compute the pixel mode
The first notebook shows us how to make a dataframe from `dcm` files.
path_dest = Path('./')
path = Path('../input/rsna-intracranial-hemorrhage-detection/rsna-intracranial-hemorrhage-detection')
path_trn = path/'stage_2_train'
fns_trn = path_trn.ls()
path_tst = path/'stage_2_test'
fns_tst = path_tst.ls()
fn = fns_trn[0]
dcm = fn.dcmread()
dcm
Looking at an example `dcm` file, we can see that we have already covered most of the data elements in our previous blog, except `Window Center`, `Window Width`, `Rescale Intercept` and `Rescale Slope`. We will focus on these.
(0028, 1050) Window Center and (0028, 1051) Window Width
The grayscale values of a CT are made up of a range of pixel values. Since CTs are either 12-bit or 16-bit, the range of possible values is 0-4095 (for 12-bit) or 0-65535 (for 16-bit). Compare this to a normal 8-bit image, whose values range between 0-255. The larger the range, the more levels of contrast. The human eye can only distinguish about 100 levels of contrast, so a 12-bit image with 4096 contrast levels would be difficult for humans to differentiate. Hence, in CTs, it is normal to choose a range of values to observe. The midpoint of that range is the `Window Center` and the size of the range is the `Window Width`. We can also think of varying the `Window Center` as varying the brightness, and varying the `Window Width` as varying the contrast.
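To make windowing concrete, here is a minimal sketch of the underlying arithmetic (`apply_window` and `hu_pixels` are hypothetical names for illustration, not fastai API):

```python
import numpy as np

def apply_window(px, center, width):
    "Clip values to [center - width/2, center + width/2] and rescale to the 0-1 display range."
    lo, hi = center - width/2, center + width/2
    return (np.clip(px, lo, hi) - lo) / (hi - lo)

# The standard brain window is roughly center=40, width=80:
# brain_view = apply_window(hu_pixels, 40, 80)
```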
fastai provides `Window Center` and `Window Width` values for observing different areas of interest. Let's take a look at a few:
dicom_windows
dicom_windows.brain
dicom_windows.subdural
Let's use fastai functionality to see some of these images under different windowing.
scales = False, True, dicom_windows.brain, dicom_windows.subdural
titles = 'raw','normalized','brain windowed','subdural windowed'
for s,a,t in zip(scales, subplots(2,2,imsize=5)[1].flat, titles):
dcm.show(scale=s, ax=a, title=t)
Rescale Intercept and Rescale Slope
Now, let's move on to understanding `Rescale Intercept` and `Rescale Slope`. The concept is pretty simple: DICOM applies a linear transformation between the pixel values stored on disk and the values used in memory,

y = mx + c

where `m` is the `Rescale Slope` and `c` is the `Rescale Intercept`.
Why is this linear transformation needed?
CT scans are measured in Hounsfield Units, which can be negative, but they are stored in an unsigned integer format, which starts at 0. Hence, a linear transformation is needed to shift the range of values.
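As a minimal sketch (reusing the `fn` from above), converting the stored pixels back to Hounsfield Units by hand looks like this; fastai's `scaled_px`, which we will see shortly, does the same thing for us:

```python
dcm = fn.dcmread()   # fn: a Path to a .dcm file
# in-memory value = stored value * slope + intercept
hu = dcm.pixel_array * dcm.RescaleSlope + dcm.RescaleIntercept
```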
Now that we have understood some of the important data elements, let's prepare the dataframes so the data is easier to work with. fastai/pydicom provides a really easy way to convert DICOM metadata to a dataframe.
def save_lbls():
    path_lbls = path/'stage_2_train.csv'
    lbls = pd.read_csv(path_lbls)
    # Each ID is "<SOPInstanceUID>_<hemorrhage type>"; split it into two columns
    lbls[["ID","htype"]] = lbls.ID.str.rsplit("_", n=1, expand=True)
    lbls.drop_duplicates(['ID','htype'], inplace=True)
    # Pivot so each hemorrhage type becomes a column of 0/1 labels per image
    pvt = lbls.pivot('ID', 'htype', 'Label')
    pvt.reset_index(inplace=True)
    pvt.to_feather('labels.fth')
#df_lbls.head(8)
#df_tst.to_feather('df_tst.fth')
#df_tst.head()
#df_trn.to_feather('df_trn.fth')
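The `df_trn` and `df_tst` dataframes referenced in the commented-out lines above come from fastai's `pd.DataFrame.from_dicoms`, which reads each file's metadata in parallel. A sketch of that call (I'm assuming the `px_summ=True` flag, which additionally records pixel statistics such as `img_mean` and `img_min` that we rely on below):

```python
# A sketch: read the metadata (plus pixel summary stats) of every DICOM file
df_trn = pd.DataFrame.from_dicoms(fns_trn, px_summ=True)
df_tst = pd.DataFrame.from_dicoms(fns_tst, px_summ=True)
```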
From here, we will work on Jeremy's "Some DICOM gotchas to be aware of".
path_df = Path('../input/dataframes')
path_df.ls()
df_lbls = pd.read_feather(path_df/'labels.fth')
df_tst = pd.read_feather(path_df/'df_tst.fth')
df_trn = pd.read_feather(path_df/'df_trn.fth')
We merge the `df_lbls` dataframe with `df_trn`.
comb = df_trn.join(df_lbls.set_index('ID'), 'SOPInstanceUID')
assert not len(comb[comb['any'].isna()])
comb.head(10)
Recap
`BitsStored` - Tells whether the data is stored in 12 or 16 bits
`PixelRepresentation` - Tells whether the data is signed or unsigned. Signed data can have negative pixel values, while unsigned data starts at 0.
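To make this concrete, a quick sketch of the value ranges these two fields imply (`px_range` is a hypothetical helper, not a DICOM or fastai function):

```python
def px_range(bits_stored, pixel_representation):
    "Return the (min, max) representable stored pixel value."
    if pixel_representation == 0:                            # unsigned
        return 0, 2**bits_stored - 1
    return -2**(bits_stored - 1), 2**(bits_stored - 1) - 1   # signed

px_range(12, 0), px_range(16, 1)   # ((0, 4095), (-32768, 32767))
```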
repr_flds = ['BitsStored','PixelRepresentation']
comb.pivot_table(values=['img_mean','img_max','img_min','PatientID','any'], index=repr_flds,
aggfunc={'img_mean':'mean','img_max':'max','img_min':'min','PatientID':'count','any':'mean'})
We can see that when `PixelRepresentation` is 1, meaning the data type is signed, `img_min` can take negative values. Largely, we will be working with unsigned 12-bit data and signed 16-bit data.
As we saw earlier, `RescaleIntercept` and `RescaleSlope` tell us how to scale our data. Let's take a look.
comb.pivot_table(values=['WindowCenter','WindowWidth', 'RescaleIntercept', 'RescaleSlope'], index=repr_flds,
aggfunc={'mean','max','min','std','median'})
Firstly, we know that CT scans are measured in Hounsfield Units (HU), which take both negative and positive values. HU measures the radiodensity (how much radiation is absorbed) of different materials such as air, fat and bone. Some example radiodensity values: air -1000 HU; fat -120 to -90 HU; bone (cortical) +500 to +1900 HU.
Based on this, we expect the mean `RescaleIntercept` for `PixelRepresentation` 0 to be around -1,024, but in our case there seem to be some images whose `RescaleIntercept` is not equal to -1,024.
The issue here could be that some of the images were signed data but were treated as unsigned. Later we will look at how to deal with them.
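A toy illustration of this wrap-around in plain Python:

```python
# A signed background value of -1000, reinterpreted as unsigned 12-bit data,
# wraps around to 3096 -- close to the suspicious mode we will see further below
(-1000) % 4096   # -> 3096
```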
Now, let's take a look at the `scaled_px` function that fastai provides to easily scale pixels based on their `RescaleIntercept` and `RescaleSlope`.
dcm = path_trn.ls(1)[0].dcmread()
dcm.pixels
plt.hist(dcm.pixels.flatten().numpy());
dcm.scaled_px
plt.hist(dcm.scaled_px.flatten().numpy());
Next, we will work on Jeremy's "DON'T see like a radiologist!" notebook.
In this notebook, Jeremy presents an idea on how to help our model see better. We saw earlier that windowing is used to help radiologists vary contrast and brightness to observe different areas of interest such as the brain and the subdural space. The reason for windowing is that humans are only able to distinguish about 100 levels of contrast, while a 16-bit CT has 2^16 (65,536) levels of contrast. This is beyond a human's ability to distinguish, hence windowing is used; but it is not a problem for a computer.
Let's see an image without windowing vs with the brain window.
fname = path_trn.ls(10)[0]
_, axes = plt.subplots(1, 2, figsize=(10,8))
dcm = fname.dcmread()
for ax, name, window in zip(axes.flat, ['no windowing', 'brain window'], [None, dicom_windows.brain]):
dcm.show(scale=window, ctx=ax, title=name)
This limitation on the number of contrast levels is not at all an issue for a computer, but we do still have to rescale the data to help our models. Let's take a look at Jeremy's proposal for rescaling.
px = dcm.scaled_px.flatten()
sns.histplot(px)
We see a highly bimodal distribution: background pixels are around -1000, and brain tissue pixels are around 0. The proposal is to use a non-linear mapping designed to give us an equal number of pixels in each range. Let's see how that is done.
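The next cell uses fastai's `freqhist_bins` for this. To build intuition, here is a rough sketch of the idea (not fastai's actual implementation):

```python
import torch

def approx_freqhist_bins(px, n_bins=100):
    "Pick bin edges at evenly spaced quantiles, so each bin holds roughly the same number of pixels."
    sorted_px = px.sort().values
    idxs = torch.linspace(0, len(sorted_px) - 1, n_bins).long()
    return sorted_px[idxs]
```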
bins = px.freqhist_bins(20)
sns.histplot(dcm.scaled_px.flatten(), bins=bins)
`fastai.medical.imaging` can apply that non-linear mapping for you. In fact, this is the default way of displaying a DICOM in fastai; it is the 'normalized' scaling we saw earlier.
plt.imshow(dcm.hist_scaled(), cmap=plt.cm.bone);
dcm.show()
Now that we know how to scale our dcm, let's take a look at preparing our dataset. Because the non-linear mapping we used varies from image to image, we will need to create a mapping that is appropriate for a wide range of images.
We will create a mapping for three different groups of images we saw earlier.
df1 = comb.query('(BitsStored==12) & (PixelRepresentation==0)')
df2 = comb.query('(BitsStored==12) & (PixelRepresentation==1)')
df3 = comb.query('BitsStored==16')
dfs = L(df1,df2,df3)
To create the bins:
- we will grab a random image with each label (`htypes`) and one with no labels, for each of the three groups above
- then we will read the `dcm` files
- put them into a tensor
- get the bins from this set of images
- use the bins in `hist_scaled_px` to get the scaled float tensor, which we can pass to our model
htypes = 'any','epidural','intraparenchymal','intraventricular','subarachnoid','subdural'
def get_samples(df):
recs = [df.query(f'{c}==1').sample() for c in htypes]
recs.append(df.query('any==0').sample())
return pd.concat(recs).fname.values
sample_fns = concat(*dfs.map(get_samples))
sample_dcms = L(Path(o).dcmread() for o in sample_fns)
samples = torch.stack(tuple(sample_dcms.attrgot('scaled_px')))
samples.shape
bins = samples.freqhist_bins()
plt.plot(bins, torch.linspace(0,1,len(bins)));
dcm.show(bins)
dcm.hist_scaled(bins), dcm.hist_scaled(bins).shape
Since the non-linear mapping results in an almost uniform distribution between 0 and 1, we won't have a mean of 0 and a std of 1 (a uniform distribution on [0, 1] has a mean of about 0.5 and a std of about 0.29).
scaled_samples = torch.stack(tuple(o.hist_scaled(bins) for o in sample_dcms))
scaled_samples.mean(),scaled_samples.std()
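If we did want a mean of 0 and a std of 1, we could normalize with the measured statistics (a sketch; later we pass similar values to fastai's `Normalize`):

```python
# Shift and scale so the batch statistics become approximately 0 and 1
norm_samples = (scaled_samples - scaled_samples.mean()) / scaled_samples.std()
norm_samples.mean(), norm_samples.std()
```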
Next, we will work on Jeremy's "Cleaning the data for rapid prototyping" notebook. In this notebook, we learn how to:
- fix images with an incorrect `RescaleIntercept`
- remove images with minimal useful information
- make a smaller dataset that we can use for prototyping
- crop images to just the brain area
- carry out histogram rescaling and save the results as JPEGs
The idea, apart from fixing incorrect images, is to create a dataset for rapid prototyping.
The problematic images were found in our df1.
df1 = comb.query('(BitsStored==12) & (PixelRepresentation==0)')
def df2dcm(df): return L(Path(o).dcmread() for o in df.fname.values)
df_iffy = df1[df1.RescaleIntercept>-100]
dcms = df2dcm(df_iffy)
_,axs = subplots(2,4, imsize=3)
for i,ax in enumerate(axs.flat): dcms[i].show(ax=ax)
That does not look good at all.
dcm = dcms[2]
d = dcm.pixel_array
plt.hist(d.flatten());
As explained previously, the mode for unsigned data should appear around 0, but in our case the mode appears around 3000. This could be because signed data was treated as if it were unsigned. As explained by Jeremy:
> My guess is that what happened in the "iffy" images is that they were actually signed data, but were treated as unsigned. If that's the case, then a value of -1000 or -1024 (the usual values for background pixels in signed data images) will have wrapped around to 4096-1000=3096. So we'll need to shift everything up by 1000, then move the values larger than 2048 back to where they should have been.
The fix is to:
- add 1000 to all pixel values
- wrap values of 4096 or more back around by subtracting 4096
- set `RescaleIntercept` to -1000
d += 1000
px_mode = scipy.stats.mode(d.flatten()).mode[0]
d[d>=px_mode] = d[d>=px_mode] - px_mode
dcm.PixelData = d.tobytes()
dcm.RescaleIntercept = -1000
plt.hist(dcm.pixel_array.flatten());
_,axs = subplots(1,2)
dcm.show(ax=axs[0]); dcm.show(dicom_windows.brain, ax=axs[1])
def fix_pxrepr(dcm):
    # Only fix unsigned images whose RescaleIntercept looks wrong (greater than -100)
    if dcm.PixelRepresentation != 0 or dcm.RescaleIntercept<-100: return
x = dcm.pixel_array + 1000
px_mode = 4096
x[x>=px_mode] = x[x>=px_mode] - px_mode
dcm.PixelData = x.tobytes()
dcm.RescaleIntercept = -1000
dcms = df2dcm(df_iffy)
dcms.map(fix_pxrepr)
_,axs = subplots(2,4, imsize=3)
for i,ax in enumerate(axs.flat): dcms[i].show(ax=ax)
The images look much better following our repair work.
To remove 'not so useful' images, we will identify what percentage of an image's pixels are in the brain window (0, 80). This is already available in the dataframe we created earlier, under `img_pct_window`.
plt.hist(comb.img_pct_window,40);
comb = comb.assign(pct_cut = pd.cut(comb.img_pct_window, [0,0.02,0.05,0.1,0.2,0.3,1]))
comb.pivot_table(values='any', index='pct_cut', aggfunc=['sum','count']).T
We can see that images with little brain tissue (<2% of pixels) have almost no labels. So we will remove them.
comb.drop(comb.query('img_pct_window<0.02').index, inplace=True)
df_lbl = comb.query('any==True')
n_lbl = len(df_lbl)
n_lbl
df_nonlbl = comb.query('any==False').sample(n_lbl//2)
len(df_nonlbl)
comb = pd.concat([df_lbl,df_nonlbl])
len(comb)
dcm = Path(dcms[10].filename).dcmread()
fix_pxrepr(dcm)
px = dcm.windowed(*dicom_windows.brain)
show_image(px);
blurred = gauss_blur2d(px, 100)  # a heavy gaussian blur smears out the bright brain region
show_image(blurred);
show_image(blurred>0.3);  # thresholding the blurred image gives a rough brain mask
_,axs = subplots(1,4, imsize=3)
for i,ax in enumerate(axs.flat):
dcms[i].show(dicom_windows.brain, ax=ax)
show_image(dcms[i].mask_from_blur(dicom_windows.brain), cmap=plt.cm.Reds, alpha=0.6, ax=ax)
def pad_square(x):
r,c = x.shape
d = (c-r)/2
pl,pr,pt,pb = 0,0,0,0
    if d>0: pt,pb = int(math.floor( d)),int(math.ceil( d))
else: pl,pr = int(math.floor(-d)),int(math.ceil(-d))
return np.pad(x, ((pt,pb),(pl,pr)), 'minimum')
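A quick check of `pad_square` on a toy array (illustrative):

```python
# A 2x4 array gets one row of padding above and one below -> 4x4
x = np.arange(8).reshape(2,4)
pad_square(x).shape   # (4, 4)
```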
def crop_mask(x):
mask = x.mask_from_blur(dicom_windows.brain)
bb = mask2bbox(mask)
if bb is None: return
lo,hi = bb
cropped = x.pixel_array[lo[0]:hi[0],lo[1]:hi[1]]
return pad_square(cropped)
_,axs = subplots(1,2)
dcm.show(ax=axs[0])
dcm_m = PILCTScan.create(crop_mask(dcm))
dcm_m.show(ax=axs[1]);
Now, we will learn to save our smaller images as JPEGs for fast prototyping. First, we will sample our dataset, and then use the samples to calculate our `freqhist_bins`.
htypes = 'any','epidural','intraparenchymal','intraventricular','subarachnoid','subdural'
def get_samples(df):
recs = [df.query(f'{c}==1').sample() for c in htypes]
recs.append(df.query('any==0').sample())
return pd.concat(recs).fname.values
sample_fns = concat(*dfs.map(get_samples))
sample_dcms = tuple(Path(o).dcmread().scaled_px for o in sample_fns)
samples = torch.stack(sample_dcms)
bins = samples.freqhist_bins()
sample_fns.shape
bins, bins.shape
with open(f"{path_dest}/bin.pkl", "wb") as output_file:
pickle.dump(bins, output_file)
Next, we will make a function that reads a single `dcm` file and fixes it using the `fix_pxrepr` function we created earlier.
def dcm_tfm(fn):
fn = Path(fn)
try:
x = fn.dcmread()
fix_pxrepr(x)
except Exception as e:
print(fn,e)
raise SkipItemException
if x.Rows != 512 or x.Columns != 512: x.zoom_to((512,512))
return x.scaled_px
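A quick sanity check on a single file (illustrative usage; `fns_trn` was created at the start of this blog):

```python
px = dcm_tfm(fns_trn[0])
px.shape   # torch.Size([512, 512])
```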
We will then use fastai's `TfmdDL` with parallel processing to process the images.
comb.fname.values
fns = list(comb.fname.values)
dest = path_dest/'train_jpg'
dest.mkdir(exist_ok=True)
# NB: Use bs=512 or 1024 when running on GPU
bs=4
ds = Datasets(fns, [[dcm_tfm],[os.path.basename]])
dl = TfmdDL(ds, bs=bs, num_workers=2)
As we can see below, the dataloader returns a tuple of `scaled_px` and the corresponding filename.
dl.one_batch()
The following functions return the output filename and save the cropped `.jpg` files.
def dest_fname(fname): return dest/Path(fname).with_suffix('.jpg')
def save_cropped_jpg(o, dest):
fname,px = o
px.save_jpg(dest_fname(fname), [dicom_windows.brain, dicom_windows.subdural], bins=bins)
In the next function, we do the following:
- move the pixels from the dataloader to the device
- make our masks for cropping only the brain portion
- then make the crop
- use fastai's parallel functionality to save the images; `save_jpg` will save the brain window, the subdural window and the normalized image, each as one channel
def process_batch(pxs, fnames, n_workers=4):
pxs = to_device(pxs)
masks = pxs.mask_from_blur(dicom_windows.brain)
bbs = mask2bbox(masks)
gs = crop_resize(pxs, bbs, 256).cpu().squeeze()
parallel(save_cropped_jpg, zip(fnames, gs), n_workers=n_workers, progress=False, dest=dest)
%time process_batch(*dl.one_batch(), n_workers=3)
Let's open and see one of the images.
fn = dest.ls()[0]
im = Image.open(fn)
fn
As can be seen below, each channel in the saved `jpg` corresponds to the brain window, the subdural window and the normalized image.
axs = subplots(1, 3)[1].flat
for i, ax in zip(tensor(im).permute(2,0,1), axs):
ax.imshow(i)
ax.axis('off')
The same thing can be shown simply using fastai functionality.
show_images(tensor(im).permute(2,0,1), titles=['brain','subdural','normalized'])
# NB: uncomment the lines below to process and save the full dataset
#dest.mkdir(exist_ok=True)
#for b in progress_bar(dl): process_batch(*b, n_workers=8)
Finally, let's build our classifier on the prototype dataset. We start by re-indexing the combined dataframe by `SOPInstanceUID`.
df_comb = comb.set_index('SOPInstanceUID')
df_tst = pd.read_feather(path_df/'df_tst.fth')
The next two functions are the ones we looked at in the previous section. We will use the `fix_pxrepr` function to fix the `PixelRepresentation` issue we encountered, and make a 3-channel image from the brain window, the subdural window and the normalized image, using the `freqhist_bins` we developed earlier.
def fix_pxrepr(dcm):
if dcm.PixelRepresentation != 0 or dcm.RescaleIntercept<-100: return
x = dcm.pixel_array + 1000
px_mode = 4096
x[x>=px_mode] = x[x>=px_mode] - px_mode
dcm.PixelData = x.tobytes()
dcm.RescaleIntercept = -1000
def dcm_tfm(fn):
fn = (path_trn/fn).with_suffix('.dcm')
try:
x = fn.dcmread()
fix_pxrepr(x)
except Exception as e:
print(fn,e)
raise SkipItemException
if x.Rows != 512 or x.Columns != 512: x.zoom_to((512,512))
px = x.scaled_px
return TensorImage(px.to_3chan(dicom_windows.brain,dicom_windows.subdural, bins=bins))
Let's make a train/validation split based on `PatientID`, so that no patient appears in both sets.
set_seed(42)
patients = df_comb.PatientID.unique()
pat_mask = np.random.random(len(patients))<0.8
pat_trn = patients[pat_mask]
def split_data(df):
idx = L.range(df)
mask = df.PatientID.isin(pat_trn)
return idx[mask],idx[~mask]
splits = split_data(df_comb)
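As a sanity check (a sketch, not from the original notebook), we can verify that no patient appears in both splits:

```python
trn_pats = set(df_comb.iloc[list(splits[0])].PatientID)
val_pats = set(df_comb.iloc[list(splits[1])].PatientID)
assert not (trn_pats & val_pats)   # no patient leaks across the split
```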
The `filename` function returns the filename (without its extension), while `fn2image` opens the file and returns a `PILDicom` object.
def filename(o): return os.path.splitext(os.path.basename(o))[0]
fns = L(list(df_comb.fname)).map(filename)
fn = fns[0]
fn
def fn2image(fn): return PILDicom.create((path_trn/fn).with_suffix('.dcm'))
fn2image(fn).show();
`fn2label` returns the labels for a given filename.
htypes = ['any','epidural','intraparenchymal','intraventricular','subarachnoid','subdural']
def fn2label(fn): return df_comb.loc[fn][htypes].values.astype(np.float32)
fn2label(fn)
`get_loss` prepares the weighted loss function. The first weight (2.0) gives the `any` label twice the weight of the individual hemorrhage types.
def get_loss(scale=1.0):
loss_weights = tensor(2.0, 1, 1, 1, 1, 1).cuda()*scale
return BaseLoss(nn.BCEWithLogitsLoss, pos_weight=loss_weights, floatify=True, flatten=False,
is_2d=False)
`accuracy_any` calculates the accuracy for the `any` label.
def accuracy_any(inp, targ, thresh=0.5, sigmoid=True):
inp,targ = flatten_check(inp[:,0],targ[:,0])
if sigmoid: inp = inp.sigmoid()
return ((inp>thresh)==targ.bool()).float().mean()
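A tiny worked example (illustrative values) of what `accuracy_any` computes:

```python
inp  = tensor([[ 2.0, 0., 0., 0., 0., 0.],
               [-1.0, 0., 0., 0., 0., 0.],
               [ 0.5, 0., 0., 0., 0., 0.]])
targ = tensor([[1., 0., 0., 0., 0., 0.],
               [0., 0., 0., 0., 0., 0.],
               [0., 0., 0., 0., 0., 0.]])
# Only the first column ('any') is used: sigmoid(2.0)>0.5 is right,
# sigmoid(-1.0)<0.5 is right, sigmoid(0.5)>0.5 is wrong -> accuracy 2/3
accuracy_any(inp, targ)
```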
loss_func = get_loss(0.14*2)
opt_func = partial(Adam, wd=0.01, eps=1e-3)
metrics=[accuracy_multi,accuracy_any]
Next, we will need to prepare the dataloaders.
First, we need the transformations necessary to open and prepare the images (we will use `dcm_tfm`) and to prepare the labels (we will use `fn2label` followed by `EncodedMultiCategorize`, which makes one-hot encoded multi-category labels).
tfms = [[dcm_tfm], [fn2label, EncodedMultiCategorize(htypes)]]
dls = Datasets(fns, tfms, splits=splits)
nrm = Normalize(tensor([0.6]),tensor([0.25]))
aug = aug_transforms(p_lighting=0.)
batch_tfms = [nrm, *aug]
nw = 4  # number of dataloader workers
def get_data(bs, sz):
    return dls.dataloaders(bs=bs, num_workers=nw,
                           after_item=[ToTensor],
                           after_batch=batch_tfms+[AffineCoordTfm(size=sz)])
dbch = get_data(64,256)
x,y = dbch.one_batch()
dbch.show_batch(max_n=4)
x.shape
bs = 32
sz=128
def get_learner(bs, sz):
dls = get_data(bs,sz)
learn = cnn_learner(dls,
xresnet50,
loss_func=loss_func,
opt_func=opt_func,
metrics=metrics)
return learn.to_fp16()
learn = get_learner(bs, sz)
learn.fit_one_cycle(1, 1e-3)