2023-07-09 19:30

#32 Collecting data is hard, and analyzing it is hard too [Science Podcasts Casual Joint Event, Part II]

Even when you manage to get good-quality data, you can't tell what story it holds unless you analyze it properly...


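[英語でサイエンスしナイト] A laid-back podcast that is mostly in English, sometimes in Japanese, and more or less about science, hosted by a researcher who recently moved back to Japan and egg, a returnee researcher who can't quite manage to move back. We'd be happy if you listened in the background as free English-learning material that also satisfies a bit of intellectual curiosity!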

-----------------------

Twitter: @eigodescience

Links: ⁠https://linktr.ee/eigodescience⁠

Music: Rice Crackers by Aves


00:12
Yeah, I envy your field. Human behaviors are much noisier and more unpredictable.
Yeah, up to the day of the experiment, we have so many conditions that we ask the subject to
follow, like instructions. But yeah, some people don't really follow those instructions.
Specifically for your research, without revealing too much,
what's a good sample size in your field?
Usually when we do some kind of preliminary experiments, if the data quality is good enough
and if the effect size is large, usually after five or six data samples, I would say,
okay, let's go ahead, or no, something is wrong, something isn't working out.
But then for the main experiment, 12, I would say 12 is enough. However,
recently reviewers wouldn't accept that small sample size anymore, so they would ask
us to collect more data, like 20, 20 subjects per condition.
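The link between effect size and the sample sizes mentioned here can be illustrated with a rough power calculation. The sketch below is not the speakers' procedure: it assumes a simple two-group, two-sided comparison under a normal approximation, and the function name `subjects_per_condition`, the defaults, and the effect sizes tried are hypothetical choices for illustration only.

```python
import math
from statistics import NormalDist


def subjects_per_condition(effect_size, alpha=0.05, power=0.8):
    """Approximate subjects needed per condition (hypothetical helper).

    Assumes a two-sided comparison of two independent groups and uses the
    normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Illustrative effect sizes only; none of these come from the episode.
    for d in (0.5, 1.0, 1.5):
        print(f"d = {d}: about {subjects_per_condition(d)} subjects per condition")
```

Under these assumptions, the required number of subjects per condition falls quickly as the standardized effect grows, which fits the point that tens of subjects, rather than hundreds, can be workable when the effect is large.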
Oh, but you're still talking like tens of people, not like hundreds or thousands.
No, we don't need that. And it would take years and years to collect 100 subject data
per condition. That's impossible.
Sometimes this happens in medical research or maybe sometimes social science research,
but you know how there's an experiment that goes on for 50 plus years and people are doing
long-term tracking of people, like if kids grew up in these kinds of conditions,
what happens to them when they're 75? Exactly, things like that. Can you imagine being part
of that kind of experiment where you don't really know what to say about it for maybe longer than
your research career life? Yeah, that's interesting.
Though, to be able to contribute to such a large project.
I think it's a lot more common thing in biological sciences. I know that there's,
I forgot exactly, but there's an experiment that's looking at the biodiversity of a specific island
and they're tracking the number of this specific bird for almost, I don't know, like 70, 80 years,
and it's still ongoing. They pass it down from PI to PI. Every generation has a different PI and they keep
03:08
carrying forward this experiment. It's really like, you always know that science that you're
participating in is bigger than yourself, but in that kind of experiment, that's really real,
that feeling that this experiment is bigger than you.
Yeah, yeah, yeah.
I don't think I'll be in that kind of field to be, well, actually, well, in cultural heritage,
that's kind of like that, but at least in my PhD research, it was more about innovation and more
about new ways to look at the same thing and same question. If anything, you can't compare the
signal to noise ratio or the quality of spectra from 30, 40 years ago. Those would be completely
unpublishable nowadays, given how much electronics have improved, computer processing has improved.
Those would be not acceptable quality, but back then they could be like,
here, this is the noisiest spectrum, but I got the signal.
Yeah, just very interesting. I was also going to ask, do you wish that you,
your experiment involves both human elements, but there's also machine limits, right? Instrumental
limits. How do you sort of, I guess you're not really trying to improve the instrument itself,
you're using the commercially available instruments, and are you making your own instrument?
We have to make suggestions to the vendor, because we need the instrument to measure, yeah, something that we need to see.
So we do collaborate with those vendors, but it's really challenging though, because our interests
need to converge. Sometimes there's a slight difference in our interests and it doesn't go
as we planned. The instruments that we wanted are not really delivered soon,
because they're not that interested.
I mean, physical chemistry is an interesting field where both commercially available
06:05
technology is heavily in use, but you can also build a lot of homemade instruments, right?
So I would say for my entire setup, the laser part, we didn't build the oscillators, right?
We bought it from a company, the laser itself, but everything downstream from laser,
like all of the routing system, the delay stage, the overlapping mechanism,
the molecular beam and detection scheme, everything after that was basically homemade.
So it was very interesting. I learned both the commercial instrument and once or twice in my
entire time there, the problem was too much for myself to handle, so we got a service engineer
to come and help us fix the problem. And it was really cool to work with him, learn his tricks
from working with this instrument, particular design of instrument for his entire career as a
field engineer. So that's really cool. But also I appreciate that the homemade instruments,
as long as you know the mechanism and the design really well, if there's a problem, you can take
it apart and try and fix it. With the commercial elements, I did learn a lot of
commercial secrets of how to fix this laser over time, but there are definitely elements
that they would not let us touch. If we do, then there's no guarantee that they can fix it.
So there comes a point where I hit the wall and I'm like, okay, I need to call the engineer.
But the homemade instrument, you can take it apart as much as you want and get to the bottom
of the problem, which is very annoying most of the time. And it's very time consuming.
And especially in my case, the large part of the instrument was not my design,
right? It was designed by someone else way, way before me, like maybe 10, 15 years ago.
And technology is different, the design principle is different. Just because he did this 15 years
ago, it doesn't mean that that's the best way. And there are many sort of questionable design
choices. I was like, I wouldn't do that, you know, but or maybe that was the only way back then with
the budget that the lab had or something. So it's a bit of detective job. And it's annoying. But I
09:00
see kind of like both ways where I appreciate that there's some commercial elements, but I also
appreciate homemade instrument. Because my friend, who was also doing physical chemistry, his lab,
his lab's policy was basically build everything. So he built like the laser itself, which is
a lot of work, you know, for a new grad student to learn. And making a laser doesn't
get you a paper because, you know, lasers have already been built. Like that's not a new technology.
That's not his question. His question is what this laser can do, experiment-wise.
So I always thought it was a bit unfair that he had to spend like two, three years
of his grad school, largely working on making sure that this homemade laser works.
And he had so little time to work on the actual science of it.
So yeah, so this could be another topic. But you know, each lab has different styles in terms of
what it requires. Like, I know, there are some big labs that will require, you know,
building everything from scratch. They'll ask everyone, even students, to do everything
by themselves. But they will learn each step, yeah, in detail. And when they become a PI,
they can do anything, you know. Because they were given the chance to build their own instruments
and everything. But in some cases, for example, in medical fields,
the situation could be very different. But this could be another topic.
Yeah, yeah. I guess it's not really about data anymore. It's just kind of
no. Yeah, like just our experiment life. But like,
it was so finicky, because it's old. And it's, it's robust when it's working. And
but it's not working most of the time. And I thought that was just like, specific to my lab,
because it was like, you know, I don't know, bad design, or I'm not good at handling this instrument.
But a person who graduated from my lab, an alumnus, came to visit my lab, maybe,
I think he graduated like 10 years before, like my time. So he's like, significantly,
like older, he's a very established national lab scientist now. He came back to give a talk in
my school. And you know, we chatted. And he's like, Oh, that instrument is still there. Like,
it's still working. Like, it was already falling apart when I was there. And I was like, Yeah,
12:00
we're still trying to make it, you know, keeping it alive. And he's like, I think the whole time
I was there, you know, five years that I was there, maybe it worked in total for two weeks.
And that's two weeks out of five years. And that's all of the data I got. And that's all I had in my
thesis. And I was hearing this in my like, second year of grad school, I was like, holy shit, I hope
it doesn't come down to that. You know, I hope I'm better than that. Or this instrument will behave
better than that. But when I add up all of the days that I was able to get usable data out, it's
probably also two to three weeks. I mean, yeah, because luckily, with my
molecule, I didn't have to run for three days or something. I ran maybe one day per condition. And
I had six different wavelengths. We tried a few different temperature settings and pressure settings.
So like, yeah, in total, maybe okay, not like two weeks, but more like three to four weeks.
In total, out of the entire five years that I was there, that's what worked because, yeah, because like,
I only have basically like two sets of, I don't know, it's not
like two papers, but like two separate independent question sets of data that were usable. I did a
whole bunch of other things that just didn't work. But the instrument was in a working condition for
maybe three to four weeks. So in my field, it was really, really hard to get data.
But once you get data, it was usually publishable. Like I didn't have to worry
about people scooping me, because my instrument is so specific, and so unique. So if I get good
data out of this and have good data interpretation, then I knew that I almost always can publish
this, which is different for some people, right? Like for some people, it's like
the data is relatively easy to get. But because a lot of people have access to that instrument
or that technique, you know, you have to be the first one, you have to be the best one.
Like, I don't know which one I like more.
Interesting. Yeah. I mean, what would you say your experiment is like?
Hmm, I have both sides, probably. One, it's easy to collect data.
15:01
Yeah. But the idea, you know, it's always about ideas.
Like concepts, what you... The question itself.
The question itself, how novel, how disruptive, in a good way. Yeah. How surprising it could be,
and which angle you want to, you know, shed the light on. Yeah. It's always about the research
question, but it's also about the techniques that we develop to some extent. That could be also
important because that way we can actually show what we couldn't before. But the most important
thing is the question that we ask, the concept, and that's the most difficult
to deal with. Yeah. So yeah, it's a different sort of... The data, like we worry about data quality,
but also the rarity of the data, you know, how new, how difficult it is to get the data itself.
And then there's a whole other dimension of, are we analyzing this correctly? Right? Which
I feel like our chat is going for far too long already. But that's going to be a bigger question
for me, how to analyze this in my postdoc. And this current paper that I'm writing,
we ended up having to do a lot more than our usual sort of standard protocol of analysis,
because my data was pretty nice, like had good signal to noise ratio, but had this very unusual
feature that we were like, where is this coming from? This signal is not random. It's there,
but we don't know where it's coming from. And we need to be able to explain this and sort of tease
out, separate this unusual feature from the feature we were looking for in order to really
say anything about the feature that we are looking for. So I ended up having to have another student
who is much, much better at sort of that kind of statistical modeling than I am to
mathematically try and understand and separate these unusual features from the features we're
looking for. That has been very difficult. It took us a long time to really be able to do that.
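The idea of separating an unexpected feature from the feature of interest can be sketched as a two-component least-squares fit. This is a generic illustration, not the analysis described in the episode: it assumes the shapes of both components are already known, and the Gaussian shapes, positions, amplitudes, noise level, and the function name `fit_two_components` are all hypothetical and synthetic.

```python
import math
import random


def gaussian(x, center, width):
    """Unit-height Gaussian line shape (an assumed shape for illustration)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def fit_two_components(y, basis_a, basis_b):
    """Least-squares amplitudes (a, b) such that y ~ a * basis_a + b * basis_b.

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    saa = sum(p * p for p in basis_a)
    sbb = sum(q * q for q in basis_b)
    sab = sum(p * q for p, q in zip(basis_a, basis_b))
    say = sum(p * v for p, v in zip(basis_a, y))
    sby = sum(q * v for q, v in zip(basis_b, y))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("components are too similar to separate")
    a = (say * sbb - sby * sab) / det
    b = (sby * saa - say * sab) / det
    return a, b


# Synthetic example: a narrow target feature overlapping a broad unexpected one.
random.seed(0)
xs = [i * 0.1 for i in range(200)]
target_shape = [gaussian(x, 8.0, 0.5) for x in xs]   # feature of interest
extra_shape = [gaussian(x, 11.0, 2.0) for x in xs]   # unexpected feature
measured = [
    1.0 * t + 0.6 * e + random.gauss(0.0, 0.02)      # made-up amplitudes + noise
    for t, e in zip(target_shape, extra_shape)
]

amp_target, amp_extra = fit_two_components(measured, target_shape, extra_shape)
print(f"target amplitude ~ {amp_target:.3f}, extra amplitude ~ {amp_extra:.3f}")
```

In practice the hard part is usually the step this sketch skips, namely deciding what shape the unexpected feature has in the first place. The determinant check reflects a real limitation: if the two components look too much alike, no fit can tell them apart.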
So that is part of the dissertation, for me, but I already know, you know, it's only been a month,
two months after my dissertation defense, I already know that what I presented is wrong,
based on this analysis. But you know what, that's science.
18:06
Yeah, that's all right.
Yeah, it's just, you know, at the time, like, you know, May 2023, we didn't know
the correct way to tease them out. And I think we now have a better way to
separate them out. We know what we know and what we cannot comment on. But it's still a sort of
better way that we went about processing this data. And we have a very different story than
what we initially envisioned. But nonetheless, not a lot of people have done this kind of careful
studies about this particular molecule. So hopefully, you know, it's still a useful
thing for the community. And yeah, and that it gets written and gets published.
I gotta work on that. Yeah.
Yeah, I'm looking forward to your publication.
Me too.
That's it for the show today. Thanks for listening and find us
at EigoDeScience on Twitter. That is E-I-G-O-D-E-S-C-I-E-N-C-E. See you next time.
19:30
